TWO-COLOUR BALANCED AFFINE URN MODELS WITH MULTIPLE
DRAWINGS I: CENTRAL LIMIT THEOREMS

MARKUS KUBA AND HOSAM M. MAHMOUD 


Abstract. This is a research endeavor in two parts. We study a class of balanced urn schemes
on balls of two colours (say white and black). At each drawing, a sample of size $m \ge 1$ is drawn
from the urn, and ball addition rules are applied. We consider these multiple drawings under
sampling with or without replacement. We further classify ball addition matrices according to the
structure of the expected value into affine and nonaffine classes. We give a necessary and sufficient
condition for a scheme to be in the affine subclass. For the affine subclass, we get explicit results
for the expected value and second moment of the number of white balls after $n$ steps and an
asymptotic expansion of the variance. Moreover, we uncover a martingale structure, amenable to
a central limit theorem formulation. This unifies several earlier works focused on special cases of
urn models with multiple drawings [5, 6, 17, 20, 21, 24]. The class is parametrized by $\Lambda$, specified
by the ratio of the two eigenvalues of a “reduced” ball replacement matrix and the sample size.
We categorize the class into small-index urns ($\Lambda < \frac12$), critical-index urns ($\Lambda = \frac12$),
large-index urns ($\Lambda > \frac12$), and triangular urns. In the present paper (Part I), we obtain central limit
theorems for small- and critical-index urns and prove almost-sure convergence for triangular and
large-index urns. In a companion paper (Part II), we discuss the moment structure of large-index
urns and triangular urns.


1. Introduction 

Urn schemes are simple, useful and versatile mathematical tools for modeling many evolutionary
processes in diverse applications such as algorithmics, genetics, epidemiology, physics,
engineering, economics, networks (social and other types), and many more. Modeling via urns
is centuries old, but perhaps the earliest contributions in the flavor commonly called Pólya urns
(the subject of the present paper) are [7, 8]. In the first of these two classics, urns were intended
to model the diffusion of gases. In the second, urns were meant to model contagion. Many Pólya
urn models useful for numerous applications were added later on. In fact, they are too many
(literally hundreds) to be listed individually. The sources [13, 16] are classic surveys listing many of
these applications; see also [19], where two chapters are devoted to applications in algorithmics
and biosciences.

While the term “Pólya urn” refers to a vast variety of schemes, there is a common thread
among most of them. Urns of the classic flavor on two colours (say white and black) evolve
in the following way. At the beginning, time zero, the urn contains a certain number of white
and black balls. Thereafter, evolution of the urn occurs in discrete time steps. At every step,
a ball is chosen at random from the urn. The colour of the ball is inspected, then the ball is
reinserted in the urn. According to the colour of the sampled ball, other balls are added/removed
following certain rules: if we have chosen a white ball, we put in the urn $a$ white balls and $b$
black balls, but if we have chosen a black ball, we put in the urn $c$ white balls and $d$ black balls.
The values $a, b, c, d \in \mathbb{Z}$ are fixed. The urn model is specified by the $2 \times 2$ ball replacement
matrix $M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$. One is usually interested in the number of white balls $W_n$ after $n$ draws,
and the number of black balls $B_n$ after $n$ draws.


1.1. Pólya urn models with multiple drawings. In the classic version of Pólya urns, one ball
is sampled at each unit of (discrete) time. The present work is devoted to the study of a
generalization of the Pólya urn model, where multiple balls are drawn at each discrete time step,
their colours are inspected, then the sample is reinserted in the urn. Additions and deletions take
place according to the drawn sample (multiset). Such urn models recently received attention
in the literature; see for example [5, 6, 14, 17, 20, 21, 23, 24]. The addition/removal of balls
depends on the combinations of colours of the drawn balls. We use the notation $\{W^k B^{m-k}\}$
to refer to a sample of size $m$ containing $k$ white balls and $m-k$ black balls. Specifically, we
draw $m \ge 1$ balls and add/remove white and black balls according to the multiset $\{W^k B^{m-k}\}$
of observed colours: If we draw $k$ white and $m-k$ black balls, we add $a_{m-k}$ white and $b_{m-k}$
black balls, $0 \le k \le m$. The ball replacement matrix of this urn model with multiple drawings
is a rectangular $(m+1) \times 2$ matrix:

\[
M = \begin{pmatrix} a_0 & b_0 \\ a_1 & b_1 \\ \vdots & \vdots \\ a_{m-1} & b_{m-1} \\ a_m & b_m \end{pmatrix}. \tag{1}
\]

We assume throughout that the urn model is balanced, such that the overall number of added/
removed balls is a constant $\sigma$, independent of the composition of the sample: $a_k + b_k = \sigma \ge 1$,
$0 \le k \le m$. Moreover, we are only interested in so-called tenable urn models, where the process
of drawing and replacing balls can be continued ad infinitum. Several of the aforementioned
works on urn models with multiple drawings were only concerned with a specific urn model.
This includes an urn model related to logic circuits [21, 24], the generalized Pólya-Eggenberger
urn [5, 6], and the generalized Friedman urn [17]. In this work, we unify and generalize these
earlier works. We do so by discussing a more general model encompassing all the previously
mentioned specific urns.


1.2. Plan of the paper and notation. The main ingredient for our analysis is to specify all
$(m+1) \times 2$ ball replacement matrices for which the conditional expectation of the number of
white balls $W_n$ after $n$ draws has an affine structure of the form

\[
\mathbb{E}[W_n \mid \mathbb{F}_{n-1}] = \alpha_n W_{n-1} + \beta_n, \qquad n \ge 1,
\]

for certain deterministic sequences $\alpha_n$, $\beta_n$, where $\mathbb{F}_n$ denotes the $\sigma$-algebra generated by the
first $n$ draws from the urn. So, we are considering a class of two-colour balanced tenable affine
urns, grown under sampling multisets. Besides this characterization, we also present a central
limit theorem for $W_n$ for urns in this class with small and critical index, a parameter that will be
defined in the sequel. We shall return in a companion paper [18] to deriving more families
of limit laws concerning urns in the class, completing the analysis of limit laws. In particular,
we discuss there urn models with a large index and triangular urns, using the so-called method of
moments.

We denote by $x^{\underline{k}}$ the $k$th falling factorial, $x(x-1)\cdots(x-k+1)$, $k \ge 0$, with $x^{\underline{0}} = 1$. We
shall also use $\nabla$, the backward difference operator, defined by $\nabla h_n = h_n - h_{n-1}$, when acting
on a function $h_n$.


2. Preliminaries 


2.1. Sampling schemes. Assume that an urn contains $w$ white and $b$ black balls. We consider
two different sampling schemes for drawing the $m$ balls at each step: model $\mathcal{M}$ and model $\mathcal{R}$.
In model $\mathcal{M}$ we draw the $m$ balls without replacement. The $m$ balls are drawn at once and their
colours are examined. After the sample is collected, we put the entire sample back in the urn
and execute the replacement rules according to the counts of colours observed. The tenability
assumption implies that for model $\mathcal{M}$ the coefficients $a_k$ of the ball replacement matrix (1) satisfy
the condition $a_k \ge -(m-k)$,¹ for $0 \le k \le m$.

The probability $\mathbb{P}(W^k B^{m-k})$ of drawing $k$ white and $m-k$ black balls is given by

\[
\mathbb{P}(W^k B^{m-k}) = \frac{1}{(b+w)^{\underline{m}}} \binom{m}{k} w^{\underline{k}}\, b^{\underline{m-k}}
= \frac{\binom{w}{k}\binom{b}{m-k}}{\binom{b+w}{m}}, \qquad 0 \le k \le m.
\]

Thus $X$, the number of white balls in the sample, follows a hypergeometric distribution, with
parameters $w + b$, $w$, and $m$, that is, one that counts the number of white balls in a sample of
size $m$ balls taken out of an urn containing $w$ white and $b$ black balls (a total of $\tau = w + b$ balls).
The expected value and second moment are given by

\[
\mathbb{E}[X] = m\,\frac{w}{\tau}, \qquad
\mathbb{E}[X^2] = \frac{w(w-1)\,m(m-1)}{\tau(\tau-1)} + \frac{wm}{\tau}.
\]

In model $\mathcal{R}$, we draw the $m$ balls with replacement. The $m$ balls are drawn one at a time. After
a ball is drawn, its colour is observed, and it is reinserted in the urn, and thus it might reappear in
the sampling of one multiset. After $m$ balls are collected in this way (and they are all back in
the urn), we execute the replacement rules according to the counts of colours observed. By the
tenability assumption, $a_k \ge -1$ for $0 \le k \le m-1$, and $a_m \ge 0$, for model $\mathcal{R}$.

The probability $\mathbb{P}(W^k B^{m-k})$ of drawing $k$ white and $m-k$ black balls is given by

\[
\mathbb{P}(W^k B^{m-k}) = \frac{1}{(b+w)^m} \binom{m}{k} w^k b^{m-k}, \qquad 0 \le k \le m.
\]


¹These assumptions can be relaxed a little bit, if the initial values $W_0$ and $B_0$ are adapted to the entries in the ball
replacement matrix. E.g., for $m = 1$, the urn model with ball replacement matrix ( 8 ,) is still tenable, if $W_0$ is
a multiple of 3 and $B_0$ a multiple of 4.

In other words, under model $\mathcal{R}$, the number of white balls in the multiset of size $m$ follows a
binomial distribution with parameters $m$ and $w/\tau$, one that counts the number of successes in $m$
independent identically distributed experiments, with $w/\tau$ probability of success per experiment.
Let $Y$ denote such a binomially distributed random variable. The expected value and second
moment are given by

\[
\mathbb{E}[Y] = m\,\frac{w}{\tau}, \qquad
\mathbb{E}[Y^2] = m(m-1)\,\frac{w^2}{\tau^2} + m\,\frac{w}{\tau}.
\]
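The two sampling laws and their moments can be checked with exact rational arithmetic. The following is a minimal sketch; the helper names `sample_pmf` and `first_two_moments` are illustrative and not part of the paper. It computes the probability of each sample composition under model M (hypergeometric) and model R (binomial), from which the stated first and second moments can be recovered.

```python
from fractions import Fraction
from math import comb


def sample_pmf(w, b, m, with_replacement):
    """Exact probabilities of drawing k white balls, for k = 0..m.

    w, b: numbers of white and black balls in the urn; m: sample size.
    """
    tau = w + b
    if with_replacement:
        # Model R: binomial with success probability w / tau.
        return [Fraction(comb(m, k) * w**k * b**(m - k), tau**m)
                for k in range(m + 1)]
    # Model M: hypergeometric; comb() returns 0 when k > w or m - k > b.
    return [Fraction(comb(w, k) * comb(b, m - k), comb(tau, m))
            for k in range(m + 1)]


def first_two_moments(pmf):
    """Return (E[K], E[K^2]) for a probability list indexed by k."""
    mean = sum(k * p for k, p in enumerate(pmf))
    second = sum(k * k * p for k, p in enumerate(pmf))
    return mean, second
```

For instance, `first_two_moments(sample_pmf(5, 7, 3, False))` can be compared with the hypergeometric formulas for $w = 5$, $b = 7$, $m = 3$.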


2.2. Stochastic recurrence. We start with $W_0$ white and $B_0$ black balls, $W_0, B_0 \in \mathbb{N}_0$, assuming
that $W_0 + B_0 \ge m$, to enable at least the first draw. Thereafter, tenability guarantees the
perpetuation of drawing. We are interested in the distribution of the numbers $W_n$ and $B_n$ of white and
black balls after $n$ draws, respectively. We denote by

\[
T_n = W_n + B_n, \qquad n \ge 0,
\]

the total number of balls contained in the urn after $n$ draws. As we are considering a class of
balanced urns, the total number of balls $T_n$ after $n$ draws is deterministic and linear in $n$:

\[
T_n = \sigma n + T_0, \qquad n \ge 0.
\]

We restrict ourselves to the case where the total number of balls increases after each draw, in
other words we consider $\sigma \ge 1$.

In what follows, we use the notation $\mathbb{I}_n(W^k B^{m-k})$ to stand for the indicator of the event that
the multiset $\{W^k B^{m-k}\}$ is drawn in the $n$th sampling. Conditioning on the composition of the
urn after $n-1$ draws, we obtain a stochastic recurrence for $W_n$. The number of white balls
after $n$ draws is the number of white balls after $n-1$ draws, plus the contribution of white balls
after the $n$th sample is obtained:

\[
W_n = W_{n-1} + \sum_{k=0}^{m} a_{m-k}\, \mathbb{I}_n(W^k B^{m-k}), \qquad n \ge 1. \tag{2}
\]


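Let $\mathbb{F}_{n-1}$ denote the $\sigma$-field generated by the first $n-1$ draws. For $0 \le k \le m$ the indicator
variables $\mathbb{I}_n(W^k B^{m-k})$ satisfy

\[
\mathbb{P}\bigl(\mathbb{I}_n(W^k B^{m-k}) = 1 \mid \mathbb{F}_{n-1}\bigr)
= \frac{\binom{W_{n-1}}{k}\binom{B_{n-1}}{m-k}}{\binom{T_{n-1}}{m}}
= \frac{\binom{W_{n-1}}{k}\binom{T_{n-1}-W_{n-1}}{m-k}}{\binom{T_{n-1}}{m}} \tag{3}
\]

for model $\mathcal{M}$, and

\[
\mathbb{P}\bigl(\mathbb{I}_n(W^k B^{m-k}) = 1 \mid \mathbb{F}_{n-1}\bigr)
= \binom{m}{k} \frac{W_{n-1}^{k}\,(T_{n-1}-W_{n-1})^{m-k}}{T_{n-1}^{m}} \tag{4}
\]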


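for model $\mathcal{R}$. We obtain for $W_n^s$, $s \ge 1$, a stochastic recurrence by taking the $s$th power of (2),
and using the fact that the indicator variables are mutually exclusive:

\[
W_n^s = \sum_{\ell=0}^{s} \binom{s}{\ell} W_{n-1}^{\ell} \sum_{k=0}^{m} a_{m-k}^{\,s-\ell}\, \mathbb{I}_n(W^k B^{m-k}), \qquad n \ge 1. \tag{5}
\]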
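The stochastic recurrence (2) translates directly into a simulation. The sketch below is illustrative only: the function names `step` and `simulate`, and the representation of the replacement rule as a Python list `a` with `a[j]` equal to $a_j$, are assumptions made for this example. It assumes a balanced, tenable urn with balance `sigma`, so that only the white additions need to be stored.

```python
import random


def step(white, total, a, m, with_replacement, rng):
    """Perform one draw and return the new number of white balls."""
    if with_replacement:
        # Model R: m independent draws, each white with probability white/total.
        k = sum(rng.random() < white / total for _ in range(m))
    else:
        # Model M: draw m distinct balls at once (1 = white, 0 = black).
        urn = [1] * white + [0] * (total - white)
        k = sum(rng.sample(urn, m))
    # k white balls in the sample: add a_{m-k} white balls, as in (2).
    return white + a[m - k]


def simulate(w0, b0, a, sigma, n, with_replacement, seed=0):
    """Run n draws and return the final (white, black) counts."""
    rng = random.Random(seed)
    m = len(a) - 1
    white, total = w0, w0 + b0
    for _ in range(n):
        white = step(white, total, a, m, with_replacement, rng)
        total += sigma  # balanced urn: T_n = sigma * n + T_0
    return white, total - white
```

For example, `simulate(3, 4, [0, 1, 2], 2, 50, False)` runs 50 draws of an urn with $m = 2$, $a_k = k$ and $\sigma = 2$ under sampling without replacement; the total number of balls afterwards is $7 + 2 \cdot 50$ regardless of the random outcome.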
3. Affine expectation


We classify ball replacement matrices according to the structure of the conditional expected
value. Our motivation is that all previously treated specific urn models with multiple
drawings [5, 6, 17, 20, 21, 24] had one feature in common, namely a simple recurrence relation for
the conditional expectation of an affine form

\[
\mathbb{E}[W_n \mid \mathbb{F}_{n-1}] = \alpha_n W_{n-1} + \beta_n, \qquad n \ge 1,
\]

where $\alpha_n$ and $\beta_n$ are certain deterministic sequences. It is desired to unify all the earlier special
cases into a single simple model, and find a more general theory to work as an umbrella for these
special cases and other special cases that may be equally important in application. In [20], a
characterization of all ball replacement matrices giving rise to an affine linear conditional expected
value was given for the case of drawing $m = 2$ balls, under sampling without replacement. We
extend this analysis in the next subsection to arbitrary $m \ge 1$, for both sampling models, and
characterize all ball replacement matrices leading to such a simple relation. (Note that our results stay
valid for $m = 1$; here our model reduces to ordinary balanced urn models.) Subsequently, this
allows us to obtain closed formulae for the expected value and second moment, and to uncover
an associated martingale structure. Later on, this is exploited to obtain limit theorems.


3.1. A necessary and sufficient condition for average affinity. We obtain, for $0 \le k \le m$, a
necessary and sufficient condition on the numbers $a_k$, $b_k$ for the conditional expectation to take an
affine form, reducing the number of significant parameters to three: $a_{m-1}$, $a_m$ and the balance $\sigma$.


Proposition 1. Suppose we are given the numbers $a_{m-1}$ and $a_m$, and the balance factor $\sigma =
a_k + b_k > 0$. For both sampling schemes, the random variable $W_n$ satisfies a linear affine
relation of the form

\[
\mathbb{E}[W_n \mid \mathbb{F}_{n-1}] = \alpha_n W_{n-1} + \beta_n, \qquad n \ge 1,
\]

if and only if for $0 \le k \le m$, the numbers $a_k$ satisfy the condition

\[
a_k = (m-k)\,a_{m-1} - (m-k-1)\,a_m.
\]

Equivalently, the coefficients $a_k$ themselves satisfy an affinity condition:

\[
a_k = a_0 + hk,
\]

with $h$ (and $h = \frac{a_m - a_0}{m}$) an integer guaranteeing tenability. The sequences $\alpha_n$ and $\beta_n$ are given
in terms of $a_{m-1}$, $a_m$ and $T_n$ by

\[
\alpha_n = \frac{T_{n-1} + m(a_{m-1} - a_m)}{T_{n-1}} \qquad \text{and} \qquad \beta_n = a_m, \qquad n \ge 1.
\]


For technical reasons we assume from this point on that for balanced affine urn models the
factors $\alpha_n$, as stated in Proposition 1, satisfy $\alpha_n > 0$ for $n \ge 1$. Equivalently, we make the
assumption $T_0 + m(a_{m-1} - a_m) > 0$. In view of tenability and the steady increase of balls
($\sigma \ge 1$) this is a natural assumption and not really a restriction. If for a certain model $T_0 +
m(a_{m-1} - a_m) \le 0$, after only a few draws (say $j_0 \ge 1$), we will have $T_{j_0} + m(a_{m-1} - a_m) > 0$.
We then restart the urn and take $j_0$ as the new beginning of time.
An immediate consequence of the affinity condition is the appearance of a martingale, and
simple closed formulae for the expected value and the variance. Moreover, by appropriate choices
of the parameters $a_{m-1}$, $a_m$ and the balance factor $\sigma$, the affinity condition covers many of the
previously treated specific urn models with multiple drawings.

Example 1. Let $a_m = a_{m-1} = c$. We obtain $a_k = c$ for $0 \le k \le m$, such that the random
variable $W_n$ degenerates to a deterministic value: $W_n = W_0 + nc$.

Example 2. For $m = 2$, we obtain the condition $a_0 - 2a_1 + a_2 = 0$; this affinity condition is
discussed in [20], which only considers model $\mathcal{M}$.

Example 3. For $a_m = mc$, $a_{m-1} = (m-1)c$ and $\sigma = mc$, we obtain the generalized Friedman
urn model with $a_k = kc$, as discussed in [17] under both sampling schemes.

Example 4. For $a_m = 0$, $a_{m-1} = c$ and $\sigma = mc$, we obtain the generalized Pólya urn model
with $a_k = (m-k)c$, as discussed in [5, 6].

Example 5. For $a_m = 1$, $a_{m-1} = 0$ and $\sigma = 1$, we obtain $a_k = -(m-k) + 1$, an urn model for
logic circuits treated in [21, 24].
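The affinity condition and the resulting form of the conditional expectation can be verified exactly for small cases. In the following sketch the function names (`affine_coefficients`, `cond_expectation`, `affine_prediction`) are hypothetical helpers introduced for illustration. It builds the coefficients $a_k = (m-k)a_{m-1} - (m-k-1)a_m$ and compares the exact one-step conditional expectation, computed from the sampling probabilities, with $\alpha_n W_{n-1} + \beta_n$. Tenability is not enforced, since the identity is purely algebraic.

```python
from fractions import Fraction
from math import comb


def affine_coefficients(a_m1, a_m, m):
    """Return [a_0, ..., a_m] with a_k = (m-k)*a_{m-1} - (m-k-1)*a_m."""
    return [(m - k) * a_m1 - (m - k - 1) * a_m for k in range(m + 1)]


def cond_expectation(white, total, a, with_replacement):
    """Exact E[W_n | W_{n-1} = white, T_{n-1} = total] for white additions a."""
    m = len(a) - 1
    black = total - white
    if with_replacement:  # model R
        probs = [Fraction(comb(m, k) * white**k * black**(m - k), total**m)
                 for k in range(m + 1)]
    else:                 # model M
        probs = [Fraction(comb(white, k) * comb(black, m - k), comb(total, m))
                 for k in range(m + 1)]
    # Drawing k white balls adds a_{m-k} white balls.
    return white + sum(a[m - k] * p for k, p in enumerate(probs))


def affine_prediction(white, total, a_m1, a_m, m):
    """alpha_n * W_{n-1} + beta_n with alpha_n, beta_n as in Proposition 1."""
    return Fraction(total + m * (a_m1 - a_m), total) * white + a_m
```

With `affine_coefficients(4, 6, 3)` one recovers the coefficients $a_k = 2k$ of Example 3 (with $c = 2$), and `affine_coefficients(2, 0, 3)` gives $a_k = 2(m-k)$ as in Example 4.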

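In order to prove Proposition 1, we first determine the general structure of the conditional
expectation.


Lemma 1. For both sampling schemes, the conditional expected value of the random
variable $W_n$ is a polynomial of degree $m$ (the sample size) in $W_{n-1}$:

\[
\mathbb{E}[W_n \mid \mathbb{F}_{n-1}] = \sum_{i=0}^{m} f_{n,i}\, W_{n-1}^{i}, \qquad n \ge 1.
\]

The values $f_{n,i}$ are model dependent. For model $\mathcal{R}$, we get

\[
f_{n,i} = \delta_{i,1} + \frac{(-1)^i}{T_{n-1}^{i}} \sum_{k=0}^{i} a_{m-k} \binom{m}{k}\binom{m-k}{m-i} (-1)^k,
\]

where $\delta_{i,1}$ denotes the Kronecker delta. For model $\mathcal{M}$, we get

\[
f_{n,i} = \delta_{i,1} + \frac{1}{T_{n-1}^{\underline{m}}} \sum_{j=0}^{m} T_{n-1}^{\underline{j}}\, [x^i]\, p_{m,j}(x),
\]

where the polynomials $p_{m,j}(x)$ are, for $0 \le j \le m$, given by

\[
p_{m,j}(x) = \sum_{k=0}^{m-j} a_{m-k} \binom{m}{k}\binom{m-k}{j}\, x^{\underline{k}}\, (-x)^{\underline{m-k-j}}.
\]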

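Proof. Our starting point is the relation

\[
\mathbb{E}[W_n \mid \mathbb{F}_{n-1}] = W_{n-1} + \sum_{k=0}^{m} a_{m-k}\, \mathbb{E}\bigl[\mathbb{I}_n(W^k B^{m-k}) \mid \mathbb{F}_{n-1}\bigr].
\]

We discuss first the proof for model $\mathcal{R}$, which is simpler. According to (4) we get

\[
\mathbb{E}[W_n \mid \mathbb{F}_{n-1}] = W_{n-1} + \sum_{k=0}^{m} a_{m-k} \binom{m}{k} \frac{W_{n-1}^{k}\,(T_{n-1} - W_{n-1})^{m-k}}{T_{n-1}^{m}}.
\]

Expanding $(T_{n-1} - W_{n-1})^{m-k}$ by the binomial theorem, and changing the order of summation
yields

\[
\mathbb{E}[W_n \mid \mathbb{F}_{n-1}] = W_{n-1} + \frac{1}{T_{n-1}^{m}} \sum_{i=0}^{m} T_{n-1}^{m-i} (-1)^i W_{n-1}^{i} \sum_{k=0}^{i} a_{m-k} \binom{m}{k}\binom{m-k}{m-i} (-1)^k.
\]

Consequently, the conditional expectation satisfies the equation

\[
\mathbb{E}[W_n \mid \mathbb{F}_{n-1}] = W_{n-1} + \sum_{i=0}^{m} \frac{(-1)^i}{T_{n-1}^{i}}\, W_{n-1}^{i} \sum_{k=0}^{i} a_{m-k} \binom{m}{k}\binom{m-k}{m-i} (-1)^k,
\]

which gives the claimed formula for $f_{n,i}$.

For model $\mathcal{M}$, from (3) we have

\[
\mathbb{E}[W_n \mid \mathbb{F}_{n-1}] = W_{n-1} + \frac{1}{T_{n-1}^{\underline{m}}} \sum_{k=0}^{m} a_{m-k} \binom{m}{k}\, W_{n-1}^{\underline{k}}\, (T_{n-1} - W_{n-1})^{\underline{m-k}}.
\]

Next, we use the binomial theorem for the falling factorials to obtain

\[
\mathbb{E}[W_n \mid \mathbb{F}_{n-1}] = W_{n-1} + \frac{1}{T_{n-1}^{\underline{m}}} \sum_{k=0}^{m} a_{m-k} \binom{m}{k}\, W_{n-1}^{\underline{k}} \sum_{j=0}^{m-k} \binom{m-k}{j}\, T_{n-1}^{\underline{j}}\, (-W_{n-1})^{\underline{m-k-j}}.
\]

Changing the order of summation gives

\[
\mathbb{E}[W_n \mid \mathbb{F}_{n-1}] = W_{n-1} + \frac{1}{T_{n-1}^{\underline{m}}} \sum_{j=0}^{m} T_{n-1}^{\underline{j}} \sum_{k=0}^{m-j} a_{m-k} \binom{m}{k}\binom{m-k}{j}\, W_{n-1}^{\underline{k}}\, (-W_{n-1})^{\underline{m-k-j}}.
\]

The inner sum on the right-hand side is exactly the polynomial $p_{m,j}(W_{n-1})$. The polynomials
can be expanded into powers of $W_{n-1}$, leading to the stated result. □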


Proof of Proposition 1. Given the numbers $a_{m-1}$ and $a_m$, we need to ensure that the conditional
expected value of $W_n$ only involves $W_{n-1}$ and constants, but no higher powers of $W_{n-1}$. By
Lemma 1, this is equivalent to the condition $f_{n,i} = 0$, $2 \le i \le m$. It remains to show that
this condition is fulfilled if and only if the coefficients of a ball replacement matrix satisfy the
stated condition $a_k = (m-k)a_{m-1} - (m-k-1)a_m$. Note that by collecting the coefficient of $k$
and expressing $a_{m-1}$ in terms of $a_0 = m(a_{m-1} - a_m) + a_m$, we have the equivalent condition
$a_k = hk + a_0$, with arbitrary $a_0$ and $h$ satisfying tenability.

We start with model $\mathcal{R}$. By Lemma 1 the condition $f_{n,i} = 0$, $2 \le i \le m$, implies the following
linear equations for the numbers $a_k$, $0 \le k \le m-2$, independent of $T_{n-1}$ and thus independent
of $n$, too:

\[
\sum_{k=2}^{i} a_{m-k} \binom{m}{k}\binom{m-k}{m-i} (-1)^k = m \binom{m-1}{m-i}\, a_{m-1} - \binom{m}{m-i}\, a_m, \qquad 2 \le i \le m.
\]


This system of linear equations is upper triangular and has a unique solution. The solution can be
obtained by Cramer’s rule. However, in order to avoid more involved calculations, we can check
that the stated solution $a_k = (m-k)a_{m-1} - (m-k-1)a_m$ satisfies the equations by simple
algebraic manipulations, which are omitted here. For model $\mathcal{M}$, by contrast to the previous case,
the $m-1$ equations $f_{n,i} = 0$, $2 \le i \le m$, are not independent of $n$, since they involve $T_{n-1}$:

\[
f_{n,i} = \frac{1}{T_{n-1}^{\underline{m}}} \sum_{j=0}^{m} T_{n-1}^{\underline{j}}\, [x^i]\, p_{m,j}(x), \qquad 2 \le i \le m.
\]

In order to ensure that $f_{n,i} = 0$ for all $n$, with $2 \le i \le m$, the coefficients $[x^i]\, p_{m,j}(x)$
of the falling factorials $T_{n-1}^{\underline{j}}$ have to vanish. Assume conversely that there exists a
largest $j = j_0$, $1 \le j \le m$, such that $[x^i]\, p_{m,j}(x) \ne 0$. Then, for large $n$, we have

\[
f_{n,i} = \frac{1}{T_{n-1}^{\underline{m}}} \sum_{j=0}^{m} T_{n-1}^{\underline{j}}\, [x^i]\, p_{m,j}(x)
= \frac{1}{T_{n-1}^{\underline{m}}} \sum_{j=0}^{j_0} T_{n-1}^{\underline{j}}\, [x^i]\, p_{m,j}(x),
\]

such that $f_{n,i} \sim \frac{T_{n-1}^{\underline{j_0}}}{T_{n-1}^{\underline{m}}}\, [x^i]\, p_{m,j_0}(x) \ne 0$. Thus, we obtain the system of equations

\[
[x^i]\, p_{m,j}(x) = [x^i] \sum_{k=0}^{m-j} a_{m-k} \binom{m}{k}\binom{m-k}{j}\, x^{\underline{k}}\, (-x)^{\underline{m-k-j}} = 0,
\]
for $2 \le i \le m$ and $0 \le j \le m$. This leads to an overdetermined system of linear equations
for the coefficients $a_k$. Instead of writing the whole system, it is sufficient to derive an exactly
solvable subsystem of equations involving all the coefficients $a_k$, $0 \le k \le m$. In order to do so,
we concentrate on the equations arising from the coefficient of $x^{m-j}$. This is the highest power
of $x$ in the polynomials $p_{m,j}(x)$. We get

\[
[x^{m-j}]\, p_{m,j}(x) = [x^{m-j}] \sum_{k=0}^{m-j} a_{m-k} \binom{m}{k}\binom{m-k}{j}\, x^{\underline{k}}\, (-x)^{\underline{m-k-j}}
= \sum_{k=0}^{m-j} a_{m-k} \binom{m}{k}\binom{m-k}{j} (-1)^{m-k-j},
\]

$0 \le j \le m$. We allow $a_{m-1}$ and $a_m$ to be freely chosen. Setting $j = m - i$ leads to the system
of equations for the numbers $a_k$ with $0 \le k \le m-2$:

\[
\sum_{k=0}^{i} a_{m-k} \binom{m}{k}\binom{m-k}{m-i} (-1)^{i-k} = 0, \qquad 2 \le i \le m.
\]

This system coincides with the system of equations previously derived for sampling with
replacement. It has the stated unique solution. Hence, the overdetermined system of equations

\[
[x^i]\, p_{m,j}(x) = 0, \qquad 2 \le i \le m, \quad 0 \le j \le m,
\]
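has either exactly one solution or no solution at all. It remains to show that coefficients satisfying
the affinity condition $a_k = (m-k)a_{m-1} - (m-k-1)a_m$ lead to a solution. Starting from (3)
we get

\[
\mathbb{E}[W_n \mid \mathbb{F}_{n-1}] = W_{n-1} + \sum_{k=0}^{m} \bigl(k(a_{m-1} - a_m) + a_m\bigr) \frac{\binom{W_{n-1}}{k}\binom{T_{n-1}-W_{n-1}}{m-k}}{\binom{T_{n-1}}{m}}.
\]

Next, we use Vandermonde’s convolution formula

\[
\sum_{k=0}^{m} \binom{x}{k}\binom{y}{m-k} = \binom{x+y}{m},
\]

together with $k\binom{x}{k} = x\binom{x-1}{k-1}$, and obtain

\[
\mathbb{E}[W_n \mid \mathbb{F}_{n-1}] = W_{n-1} + W_{n-1}(a_{m-1} - a_m) \frac{\binom{T_{n-1}-1}{m-1}}{\binom{T_{n-1}}{m}} + a_m
= W_{n-1}\, \frac{T_{n-1} + m(a_{m-1} - a_m)}{T_{n-1}} + a_m.
\]

□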
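The algebraic manipulations omitted in the proof can be checked numerically. The sketch below uses hypothetical helper names (`affine_coefficients`, `higher_order_sum`) and evaluates the sums $\sum_{k=0}^{i} a_{m-k}\binom{m}{k}\binom{m-k}{m-i}(-1)^{i-k}$ for $2 \le i \le m$. Under the affinity condition all of them should vanish, while a rule violating the condition leaves a nonzero sum.

```python
from math import comb


def affine_coefficients(a_m1, a_m, m):
    """Return [a_0, ..., a_m] with a_k = (m-k)*a_{m-1} - (m-k-1)*a_m."""
    return [(m - k) * a_m1 - (m - k - 1) * a_m for k in range(m + 1)]


def higher_order_sum(a, i):
    """Coefficient sum attached to the i-th power of W_{n-1} (up to a factor)."""
    m = len(a) - 1
    return sum(a[m - k] * comb(m, k) * comb(m - k, m - i) * (-1) ** (i - k)
               for k in range(i + 1))


def is_affine(a):
    """True if every coefficient sum of order i >= 2 vanishes."""
    m = len(a) - 1
    return all(higher_order_sum(a, i) == 0 for i in range(2, m + 1))
```

For $m = 2$ the single equation reduces to $a_0 - 2a_1 + a_2 = 0$, in agreement with Example 2; the rule `[1, 0, 0]` violates it.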


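3.2. Expected value and second moment. Next, we generalize the result of Bagchi and Pal [1]
for the expected value and the second moment, when drawing a single ball (the case $m = 1$), to
balanced affine urn models with multiple drawings. In order to state our result we introduce the
quantity $g_n$ given by

\[
g_n = \prod_{j=0}^{n-1} \frac{T_j}{T_j + m(a_{m-1} - a_m)}
= \frac{\binom{n-1+\frac{T_0}{\sigma}}{n}}{\binom{n-1+\frac{T_0 + m(a_{m-1} - a_m)}{\sigma}}{n}}
= \frac{\Gamma\bigl(n + \frac{T_0}{\sigma}\bigr)\,\Gamma\bigl(\frac{T_0 + m(a_{m-1} - a_m)}{\sigma}\bigr)}{\Gamma\bigl(\frac{T_0}{\sigma}\bigr)\,\Gamma\bigl(n + \frac{T_0 + m(a_{m-1} - a_m)}{\sigma}\bigr)}. \tag{6}
\]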


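Proposition 2. The expected value of the random variable $W_n$, counting the number of white
balls in a two-colour balanced affine urn model with multiple drawings, is for both sampling
models $\mathcal{M}$ and $\mathcal{R}$ given by

\[
\mathbb{E}[W_n] = \frac{a_m}{g_n} \sum_{j=1}^{n} g_j + W_0\, \frac{1}{g_n},
\]

with $g_n$ as stated above in (6). For $\frac{m(a_{m-1} - a_m)}{\sigma} < 1$ we have the closed form expression

\[
\mathbb{E}[W_n] = \frac{a_m \bigl(n + \frac{T_0}{\sigma}\bigr)}{1 - \frac{m(a_{m-1} - a_m)}{\sigma}}
+ \Bigl(W_0 - \frac{a_m T_0}{\sigma - m(a_{m-1} - a_m)}\Bigr)
\frac{\binom{n-1+\frac{T_0 + m(a_{m-1} - a_m)}{\sigma}}{n}}{\binom{n-1+\frac{T_0}{\sigma}}{n}},
\]

as well as the asymptotic expansion

\[
\mathbb{E}[W_n] = \frac{a_m \sigma}{\sigma - m(a_{m-1} - a_m)}\, n
+ \Bigl(W_0 - \frac{a_m T_0}{\sigma - m(a_{m-1} - a_m)}\Bigr)
\frac{\Gamma\bigl(\frac{T_0}{\sigma}\bigr)}{\Gamma\bigl(\frac{T_0 + m(a_{m-1} - a_m)}{\sigma}\bigr)}\, n^{\frac{m(a_{m-1} - a_m)}{\sigma}} + O(1).
\]

Moreover, for $\frac{m(a_{m-1} - a_m)}{\sigma} = 1$ we obtain $\mathbb{E}[W_n] = W_0\, \frac{T_n}{T_0}$.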
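The closed form for the expected value can be cross-checked against the defining recurrence $\mathbb{E}[W_n] = \alpha_n \mathbb{E}[W_{n-1}] + a_m$. The following sketch does so in exact rational arithmetic; all function names are illustrative, and the closed form coded here is the expression $\frac{a_m (n + T_0/\sigma)}{1 - m(a_{m-1}-a_m)/\sigma} + \bigl(W_0 - \frac{a_m T_0}{\sigma - m(a_{m-1}-a_m)}\bigr) / g_n$, assumed valid when $m(a_{m-1}-a_m)/\sigma < 1$.

```python
from fractions import Fraction


def g(n, t0, sigma, m, delta):
    """g_n = prod_{j<n} T_j / (T_j + m*delta), where T_j = sigma*j + t0."""
    out = Fraction(1)
    for j in range(n):
        t_j = sigma * j + t0
        out *= Fraction(t_j, t_j + m * delta)
    return out


def mean_by_recurrence(n, w0, t0, sigma, m, a_m1, a_m):
    """Iterate E[W_j] = alpha_j * E[W_{j-1}] + a_m for j = 1..n."""
    delta = a_m1 - a_m
    mean = Fraction(w0)
    for j in range(1, n + 1):
        t_prev = sigma * (j - 1) + t0
        mean = Fraction(t_prev + m * delta, t_prev) * mean + a_m
    return mean


def mean_closed_form(n, w0, t0, sigma, m, a_m1, a_m):
    """Closed form; requires m*(a_m1 - a_m)/sigma < 1."""
    delta = a_m1 - a_m
    index = Fraction(m * delta, sigma)
    slope = a_m / (1 - index)
    offset = slope * Fraction(t0, sigma)
    return slope * n + offset + (w0 - offset) / g(n, t0, sigma, m, delta)
```

As a usage example, for $m = 2$, $a_{m-1} = 1$, $a_m = 2$, $\sigma = 2$, $W_0 = 3$ and $T_0 = 7$, both functions give $\mathbb{E}[W_1] = 29/7$.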
Proof of Proposition 2. From Proposition 1 we get

\[
\mathbb{E}[W_n] = \frac{T_{n-1} + m(a_{m-1} - a_m)}{T_{n-1}}\, \mathbb{E}[W_{n-1}] + a_m, \qquad n \ge 1. \tag{7}
\]

Multiplication with $g_n$ as defined in (6) gives the recurrence relation

\[
g_n\, \mathbb{E}[W_n] = g_{n-1}\, \mathbb{E}[W_{n-1}] + g_n a_m,
\]

such that

\[
\mathbb{E}[W_n] = \frac{a_m}{g_n} \sum_{j=1}^{n} g_j + W_0\, \frac{g_0}{g_n}. \tag{8}
\]

Applying the summation formula

\[
\sum_{k=1}^{n} \frac{\binom{k+x}{k}}{\binom{k+y}{k}} = \frac{(n+1+y)\binom{n+1+x}{n+1}}{(x+1-y)\binom{n+1+y}{n+1}} - \frac{x+1}{x+1-y}
\]

to the sum involving $g_n$, which has the form $\binom{n+x}{n} / \binom{n+y}{n}$, with $x = \frac{T_0}{\sigma} - 1$ and
$y = \frac{T_0}{\sigma} - 1 + \frac{m(a_{m-1} - a_m)}{\sigma}$, we obtain the result, valid for $\frac{m(a_{m-1} - a_m)}{\sigma} < 1$.
For $\frac{m(a_{m-1} - a_m)}{\sigma} = 1$, we observe
that by the tenability assumption on the urn, we obtain for both sampling models the conditions
$a_m \ge 0$, and also $b_0 \ge 0$. Thus, we get from Proposition 1

\[
\sigma = a_0 + b_0 = m(a_{m-1} - a_m) + a_m + b_0 = \sigma + a_m + b_0,
\]

such that $a_m = b_0 = 0$, and the result follows directly from (8).
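The summation formula used in this step follows from the telescoping relation $(k+1+y)\,r_{k+1} - (k+y)\,r_k = (x+1-y)\,r_k$ for $r_k = \binom{k+x}{k}/\binom{k+y}{k}$. A small exact check is sketched below; the names `ratio`, `lhs` and `rhs` are hypothetical, and the identity is assumed in the form $\sum_{k=1}^{n} r_k = \frac{(n+1+y)\,r_{n+1}}{x+1-y} - \frac{x+1}{x+1-y}$ with $x + 1 \ne y$.

```python
from fractions import Fraction


def ratio(k, x, y):
    """binom(k+x, k) / binom(k+y, k), computed as prod_{i=1..k} (i+x)/(i+y)."""
    out = Fraction(1)
    for i in range(1, k + 1):
        out *= (i + x) / (i + y)
    return out


def lhs(n, x, y):
    """Left-hand side: the finite sum over k = 1..n."""
    return sum((ratio(k, x, y) for k in range(1, n + 1)), Fraction(0))


def rhs(n, x, y):
    """Right-hand side of the summation formula; requires x + 1 != y."""
    return ((n + 1 + y) * ratio(n + 1, x, y) - (x + 1)) / (x + 1 - y)
```

For example, with $x = 5/2$, $y = 2$ and $n = 1$ both sides equal $7/6$.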

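In order to obtain asymptotic expansions, we only need Stirling’s formula for the Gamma
function:

\[
\Gamma(z) = \Bigl(\frac{z}{e}\Bigr)^{z} \frac{\sqrt{2\pi}}{\sqrt{z}} \Bigl(1 + \frac{1}{12z} + \frac{1}{288z^2} + O\Bigl(\frac{1}{z^3}\Bigr)\Bigr), \qquad |z| \to \infty.
\]

Hence, we obtain

\[
\frac{1}{g_n} = \frac{\Gamma\bigl(\frac{T_0}{\sigma}\bigr)\,\Gamma\bigl(n + \frac{T_0 + m(a_{m-1} - a_m)}{\sigma}\bigr)}{\Gamma\bigl(n + \frac{T_0}{\sigma}\bigr)\,\Gamma\bigl(\frac{T_0 + m(a_{m-1} - a_m)}{\sigma}\bigr)}
= \frac{\Gamma\bigl(\frac{T_0}{\sigma}\bigr)}{\Gamma\bigl(\frac{T_0 + m(a_{m-1} - a_m)}{\sigma}\bigr)}\, n^{\frac{m(a_{m-1} - a_m)}{\sigma}} \Bigl(1 + O\Bigl(\frac{1}{n}\Bigr)\Bigr),
\]

yielding the stated result. □

Proposition 3. For balanced affine urn schemes, the second moment of $W_n$ is

\[
\mathbb{E}[W_n^2] = \frac{\binom{n-1+\lambda_1}{n}\binom{n-1+\lambda_2}{n}}{\binom{n-1+\frac{T_0}{\sigma}}{n}\binom{n-1+\frac{T_0-1}{\sigma}}{n}}
\Biggl(W_0^2 + \sum_{j=1}^{n} \bigl(\beta_j\, \mathbb{E}[W_{j-1}] + a_m^2\bigr)
\frac{\binom{j-1+\frac{T_0}{\sigma}}{j}\binom{j-1+\frac{T_0-1}{\sigma}}{j}}{\binom{j-1+\lambda_1}{j}\binom{j-1+\lambda_2}{j}}\Biggr)
\]

for model $\mathcal{M}$, with

\[
\lambda_{1,2} = \frac{T_0}{\sigma} + \frac{m(a_{m-1} - a_m) - \frac12}{\sigma} \pm \frac{\sqrt{1 + 4m(a_{m-1} - a_m)(a_{m-1} - a_m + 1)}}{2\sigma}
\]

and

\[
\beta_n = (a_{m-1} - a_m)^2\, \frac{m}{T_{n-1}} \Bigl(1 - \frac{m-1}{T_{n-1}-1}\Bigr) + \frac{2m a_m (a_{m-1} - a_m)}{T_{n-1}} + 2a_m. \tag{9}
\]

For model $\mathcal{R}$, the second moment of $W_n$ is

\[
\mathbb{E}[W_n^2] = \frac{\binom{n-1+\mu_1}{n}\binom{n-1+\mu_2}{n}}{\binom{n-1+\frac{T_0}{\sigma}}{n}^2}
\Biggl(W_0^2 + \sum_{j=1}^{n} \bigl(\beta_j\, \mathbb{E}[W_{j-1}] + a_m^2\bigr)
\frac{\binom{j-1+\frac{T_0}{\sigma}}{j}^2}{\binom{j-1+\mu_1}{j}\binom{j-1+\mu_2}{j}}\Biggr),
\]

with $\mu_{1,2} = \frac{T_0}{\sigma} + \frac{m(a_{m-1} - a_m)}{\sigma} \pm \frac{\sqrt{m}\,(a_{m-1} - a_m)}{\sigma}$ and

\[
\beta_n = \frac{(a_{m-1} - a_m)^2\, m}{T_{n-1}} + \frac{2m a_m (a_{m-1} - a_m)}{T_{n-1}} + 2a_m. \tag{10}
\]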


Proof. We use the stochastic recurrence (5), with $s = 2$, and obtain for the conditional
expectation the equation

\[
\mathbb{E}[W_n^2 \mid \mathbb{F}_{n-1}] = W_{n-1}^2 + \sum_{k=0}^{m} \bigl(2 W_{n-1}\, a_{m-k} + a_{m-k}^2\bigr)\, \mathbb{E}\bigl[\mathbb{I}_n(W^k B^{m-k}) \mid \mathbb{F}_{n-1}\bigr].
\]

By Proposition 1 and the affinity condition, we further get

\[
\mathbb{E}[W_n^2 \mid \mathbb{F}_{n-1}] = W_{n-1}^2 + a_m (2 W_{n-1} + a_m)
+ \sum_{k=0}^{m} \bigl(k^2 (a_{m-1} - a_m)^2 + 2k (a_{m-1} - a_m)(a_m + W_{n-1})\bigr)\, \mathbb{E}\bigl[\mathbb{I}_n(W^k B^{m-k}) \mid \mathbb{F}_{n-1}\bigr]. \tag{11}
\]

The sums depend on the particular sampling model. According to (3), for model $\mathcal{M}$, the number
of drawn white balls in the sample of size $m$ is given by a hypergeometric distribution with
parameters $T_{n-1}$, $W_{n-1}$ and $m$. Alternatively, for model $\mathcal{R}$, the number of drawn white balls in
the sample of size $m$ is given by a binomial distribution with parameters $m$ and $W_{n-1}/T_{n-1}$.

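We take expectations and use the results of Section 2 to simplify the sums. Consequently, we
obtain for both models a linear recurrence relation of the form

\[
\mathbb{E}[W_n^2] = \alpha_n\, \mathbb{E}[W_{n-1}^2] + \beta_n\, \mathbb{E}[W_{n-1}] + \gamma_n, \qquad n \ge 1,
\]

with $\mathbb{E}[W_n]$ as given in Proposition 2. For model $\mathcal{M}$, the sequences $\alpha_n$ and $\gamma_n$ are given by

\[
\alpha_n = 1 + \frac{(a_{m-1} - a_m)^2\, m(m-1)}{T_{n-1}(T_{n-1} - 1)} + \frac{2(a_{m-1} - a_m)\, m}{T_{n-1}}, \qquad \gamma_n = a_m^2,
\]

and $\beta_n$ as stated in (9). For model $\mathcal{R}$, we have

\[
\alpha_n = 1 + \frac{(a_{m-1} - a_m)^2\, m(m-1)}{T_{n-1}^2} + \frac{2(a_{m-1} - a_m)\, m}{T_{n-1}}, \qquad \gamma_n = a_m^2,
\]

and $\beta_n$ as stated in (10). The recurrence relation for $\mathbb{E}[W_n^2]$ is readily solved in a manner similar
to that we used to solve (7), and we obtain

\[
\mathbb{E}[W_n^2] = \Bigl(\prod_{i=1}^{n} \alpha_i\Bigr) \Biggl(W_0^2 + \sum_{j=1}^{n} \frac{\beta_j\, \mathbb{E}[W_{j-1}] + a_m^2}{\prod_{i=1}^{j} \alpha_i}\Biggr),
\]

with $\mathbb{E}[W_n]$ given by Proposition 2. Finally, we simplify the products $\prod_{i=1}^{n} \alpha_i$ by viewing $\alpha_n$ as
a rational function in the variable $n$, and factorizing it into linear terms of the forms

\[
\alpha_n = \frac{(n-1+\lambda_1)(n-1+\lambda_2)}{\bigl(n-1+\frac{T_0}{\sigma}\bigr)\bigl(n-1+\frac{T_0-1}{\sigma}\bigr)}
\qquad \text{and} \qquad
\alpha_n = \frac{(n-1+\mu_1)(n-1+\mu_2)}{\bigl(n-1+\frac{T_0}{\sigma}\bigr)^2},
\]

for models $\mathcal{M}$ and $\mathcal{R}$, respectively. □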
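The coefficients of the second-moment recurrence can be checked against a direct computation of $\mathbb{E}[W_n^2 \mid W_{n-1}]$. In the sketch below, the helper names are hypothetical, and the coefficient formulas coded in `second_moment_coeffs` are the ones written above for $\alpha_n$, $\beta_n$ and $\gamma_n$, taken here as assumptions to be tested. The function `direct_second_moment` averages $(W_{n-1} + a_{m-k})^2$ over the sample distribution, and the result is compared with $\alpha_n W_{n-1}^2 + \beta_n W_{n-1} + \gamma_n$.

```python
from fractions import Fraction
from math import comb


def direct_second_moment(white, total, a_m1, a_m, m, with_replacement):
    """Exact E[W_n^2 | W_{n-1} = white] under the affinity condition."""
    black = total - white
    out = Fraction(0)
    for k in range(m + 1):
        if with_replacement:  # model R
            p = Fraction(comb(m, k) * white**k * black**(m - k), total**m)
        else:                 # model M
            p = Fraction(comb(white, k) * comb(black, m - k), comb(total, m))
        added = a_m + k * (a_m1 - a_m)  # a_{m-k} under the affinity condition
        out += p * (white + added) ** 2
    return out


def second_moment_coeffs(total, a_m1, a_m, m, with_replacement):
    """Return (alpha_n, beta_n, gamma_n) for T_{n-1} = total."""
    d = a_m1 - a_m
    t = Fraction(total)
    if with_replacement:
        alpha = 1 + d * d * m * (m - 1) / t**2 + 2 * d * m / t
        beta = d * d * m / t + 2 * m * a_m * d / t + 2 * a_m
    else:
        alpha = 1 + d * d * m * (m - 1) / (t * (t - 1)) + 2 * d * m / t
        beta = d * d * (m / t) * (1 - (m - 1) / (t - 1)) + 2 * m * a_m * d / t + 2 * a_m
    return alpha, beta, Fraction(a_m**2)
```

For instance, with $T_{n-1} = 10$, $W_{n-1} = 4$, $m = 3$, $a_{m-1} = 2$ and $a_m = 5$ under model M, both computations give $171/5$.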


3.3. Martingale structure. Next, we deduce from the linear affine structure of the conditional
expected value of $W_n$ and the previous result for the expected value the following result.

Proposition 4. For balanced affine urn schemes with $a_m \ne 0$, the centered random variable

\[
\mathcal{W}_n = g_n \bigl(W_n - \mathbb{E}[W_n]\bigr) = g_n W_n - a_m \sum_{j=1}^{n} g_j - W_0,
\]

with $g_n$ as defined in (6), is a martingale with respect to the natural filtration: $\mathbb{E}[\mathcal{W}_n \mid \mathbb{F}_{n-1}] =
\mathcal{W}_{n-1}$, $n \ge 1$, and $\mathcal{W}_0 = 0$.

For balanced affine urn schemes with $a_m = 0$, the random variable $\mathfrak{W}_n = g_n W_n$ is a
nonnegative martingale and converges almost surely to a limit $\mathfrak{W}_\infty$.


Proof of Proposition 4. By Proposition 1 the conditional expectation is given by

\[
\mathbb{E}[W_n \mid \mathbb{F}_{n-1}] = W_{n-1}\, \frac{T_{n-1} + m(a_{m-1} - a_m)}{T_{n-1}} + a_m,
\]

for $n \ge 1$. As in the proof of Proposition 2, we obtain

\[
\mathbb{E}[g_n W_n \mid \mathbb{F}_{n-1}] = g_{n-1} W_{n-1} + g_n a_m, \qquad n \ge 1.
\]

By definition

\[
\mathcal{W}_n = g_n W_n - a_m \sum_{j=1}^{n} g_j - W_0,
\]

and we get the representation

\[
\mathbb{E}[g_n W_n \mid \mathbb{F}_{n-1}] - a_m \sum_{j=1}^{n} g_j - W_0 = g_{n-1} W_{n-1} - a_m \sum_{j=1}^{n-1} g_j - W_0, \tag{12}
\]

such that

\[
\mathbb{E}[\mathcal{W}_n \mid \mathbb{F}_{n-1}] = \mathcal{W}_{n-1}.
\]

We also note that $\mathcal{W}_0 = g_0 (W_0 - \mathbb{E}[W_0]) = 0$, and so $\mathbb{E}[\mathcal{W}_n] = 0$. Moreover,

\[
\mathbb{E}\bigl[|\mathcal{W}_n|\bigr] = g_n\, \mathbb{E}\bigl[|W_n - \mathbb{E}[W_n]|\bigr].
\]

For $a_m = 0$, we note that $W_n \ge 0$ and also $\mathfrak{W}_n = g_n W_n \ge 0$. By martingale theory, $\mathfrak{W}_n$
converges almost surely to a limit: $\mathfrak{W}_n \xrightarrow{\text{a.s.}} \mathfrak{W}_\infty$. □

4. Limit theorems


In this section, we discuss limit theorems for the number of white balls. Our limit theorems
are valid for arbitrary $m \ge 1$, unifying the earlier observed phenomena for the case $m = 1$, and
covering new such cases, as well as extending the result to larger sample size. For balanced urn
models and a single ball in the sample, one considers the ball replacement matrix
$M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$
with balance factor $\sigma$, with eigenvalues $\Lambda_1 = \sigma$ and $\Lambda_2 = a - c$. For this classic case, there is a
known trichotomy [2, 3, 11, 12, 15]: (1) triangular urn models with a nongaussian limit for $c = 0$
(or $b = 0$), (2) the so-called small urns with a Gaussian limit for $c > 0$ and $\Lambda_2/\Lambda_1 \le \frac12$, and (3)
the so-called large urns with a nongaussian limit for $c > 0$ and $\Lambda_2/\Lambda_1 > \frac12$. Note that owing to
the balance, the urn actually has only three parameters $a$, $c$ and $\sigma$. The terms “small urns” and
“large urns” were used by other researchers [2]. We prefer to think of the ratio of eigenvalues as
an index and refer to urns with small versus large index. It is the index that can be large or small,
not the physical container (urn, box, etc.).

For urn models with multiple drawings and affine expectation, we obtain a similar
characterization. By Proposition 1, our class of urns is determined by $a_{m-1}$, $a_m$ and the balance
factor $\sigma$, satisfying the affinity condition $a_k = (m-k)(a_{m-1} - a_m) + a_m$, $0 \le k \le m$. We call
$A = \begin{pmatrix} a_{m-1} & b_{m-1} \\ a_m & b_m \end{pmatrix}$ the reduced ball replacement matrix. For the affine subclass of balanced
urn models, the eigenvalues of $A$ are $\Lambda_1 = \sigma$ and $\Lambda_2 = a_{m-1} - a_m$. It turns out that the behaviour
of the urn critically depends on the urn index $\Lambda$ given by the ratio $\Lambda_2/\Lambda_1$ of the two eigenvalues
of $A$ times the sample size $m$:

\[
\Lambda = \Lambda(m, \sigma) := \frac{m}{\sigma}\,(a_{m-1} - a_m).
\]

This parameter governs the growth of the second largest term in the asymptotic expansion of the
expected value. For instance, in terms of this index, the expectation in Proposition 2 is

\[
\mathbb{E}[W_n] = \frac{a_m}{1 - \Lambda}\, n + O(n^{\Lambda}) + O(1).
\]

In the following we obtain a central limit theorem for urn models with “small index” $\Lambda < \frac12$ and
“critical index” $\Lambda = \frac12$. Note that the case $\Lambda = 0$ is excluded from our considerations because
it leads to $a_m = a_{m-1}$ and by the affinity condition to $a_k = a_m$, $0 \le k \le m$; thus we have
deterministic development: $W_n = W_0 + a_m n$. We also obtain almost sure and $L_2$-convergence
for “large index” urns $\Lambda > \frac12$. We call an urn model triangular if $a_m = 0$ or $b_0 = 0$ (or both). We
already obtained for $a_m = 0$ almost sure convergence in Proposition 4. Since $B_n = T_n - W_n$,
we can reduce the case $b_0 = 0$ and $a_m > 0$ by reversing the colours to $a_m = 0$ and $b_0 > 0$. A
detailed study of the moment structure of large index urns and the triangular urns with $a_m = 0$
(and $b_0 > 0$) will appear in a companion work.
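The classification by the urn index can be summarized in a few lines. This is a minimal sketch with hypothetical function names (`urn_index`, `classify`); it assumes the affinity condition, so that $b_0 = \sigma - a_0$ with $a_0 = a_m + m(a_{m-1} - a_m)$, and it reports the degenerate case $\Lambda = 0$ before testing for triangularity, which is an ordering chosen for this example only.

```python
from fractions import Fraction


def urn_index(m, sigma, a_m1, a_m):
    """Urn index Lambda = m * (a_{m-1} - a_m) / sigma as an exact fraction."""
    return Fraction(m * (a_m1 - a_m), sigma)


def classify(m, sigma, a_m1, a_m):
    """Classify a balanced affine urn by its index and triangularity."""
    index = urn_index(m, sigma, a_m1, a_m)
    b_0 = sigma - (a_m + m * (a_m1 - a_m))  # b_0 = sigma - a_0
    if index == 0:
        return "degenerate"        # deterministic growth W_n = W_0 + a_m * n
    if a_m == 0 or b_0 == 0:
        return "triangular"
    half = Fraction(1, 2)
    if index < half:
        return "small-index"
    if index == half:
        return "critical-index"
    return "large-index"
```

For example, the generalized Friedman urn with $m = 2$ and $c = 1$ has index $-1$ and is classified as small-index, while the generalized Pólya urn of Example 4 has $a_m = 0$ and is triangular.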


4.1. Asymptotic expansion of the variance. An asymptotic expansion of the variance of $W_n$
can be obtained from the explicit expressions for the expected value and the second moment. It
is required to prove later on a central limit theorem for $\Lambda \le \frac12$ and almost sure convergence of
large-index urns.
Theorem 1. For balanced affine urn schemes, the variance satisfies the following expansions:
Small-index urns, the case $\Lambda < \frac12$:

\[
\mathbb{V}[W_n] = \frac{a_m b_0 \Lambda^2}{m(1 - 2\Lambda)(1 - \Lambda)^2}\, n + o(n).
\]

Critical-index urns, the case $\Lambda = \frac12$:

\[
\mathbb{V}[W_n] = \frac{a_m b_0}{m}\, n \log n + O(n).
\]

Large-index urns, the case $\Lambda > \frac12$:

\[
\mathbb{V}[W_n] = C n^{2\Lambda} + O(n),
\]

with the constant $C$ being model-dependent given by an infinite sum:

o-m'I Q p/ Tq 


/ a mJ- 0 

W 2 ~ rnwsw+<- ♦yaw- 1 (aj‘- A + (Wo--tfj-) 


r(?) 

r($+A) 


+ ^((SA - 1 ) + 2 a m (V 0 - 

amTo 2 p2/Tb \ 

(T \ ' (T / 


amT 0 N r( —) 


r (p + A) 


C(A) 


Wn 


i — A/ r 2 (^ + A)’ 


with $\zeta(z)$ denoting the Riemann zeta function and $\beta_j$, $\psi_j$, $\mathbb{E}[W_{j-1}]$ as given in (9), (10), (14), and
Proposition 4.
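The linear growth of the variance for small-index urns can be observed numerically by iterating the first- and second-moment recurrences. The sketch below uses floating-point arithmetic and hypothetical function names; it assumes the recurrence coefficients $\alpha_n$, $\beta_n$, $\gamma_n = a_m^2$ from the proof of Proposition 3 and the small-index constant $\frac{a_m b_0 \Lambda^2}{m(1-2\Lambda)(1-\Lambda)^2}$, with $b_0 = \sigma - a_0$.

```python
def variance(n, w0, t0, sigma, m, a_m1, a_m, with_replacement):
    """V[W_n] obtained by iterating the recurrences for E[W_j] and E[W_j^2]."""
    d = a_m1 - a_m
    mean, second = float(w0), float(w0 * w0)
    for j in range(1, n + 1):
        t = float(sigma * (j - 1) + t0)  # T_{j-1}
        if with_replacement:  # model R
            alpha = 1 + d * d * m * (m - 1) / t**2 + 2 * d * m / t
            beta = d * d * m / t + 2 * m * a_m * d / t + 2 * a_m
        else:                 # model M
            alpha = 1 + d * d * m * (m - 1) / (t * (t - 1)) + 2 * d * m / t
            beta = d * d * (m / t) * (1 - (m - 1) / (t - 1)) + 2 * m * a_m * d / t + 2 * a_m
        # The second moment must be updated first: it uses E[W_{j-1}].
        second = alpha * second + beta * mean + a_m**2
        mean = (1 + m * d / t) * mean + a_m
    return second - mean * mean


def small_index_constant(m, sigma, a_m1, a_m):
    """Leading constant of V[W_n] / n in the small-index case."""
    d = a_m1 - a_m
    index = m * d / sigma
    b_0 = sigma - (a_m + m * d)
    return a_m * b_0 * index**2 / (m * (1 - 2 * index) * (1 - index) ** 2)
```

For the generalized Friedman urn with $m = 2$, $c = 1$ (so $a_{m-1} = 1$, $a_m = 2$, $\sigma = 2$, $\Lambda = -1$), the constant evaluates to $1/6$, and `variance(2000, 3, 7, 2, 2, 1, 2, True) / 2000` should be close to that value under both sampling models.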

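Proof. Our starting point is the expression for $\mathbb{E}[W_n^2]$ in Proposition 3. In order to perform a
unified analysis for the two models, we use the representation

\[
\mathbb{E}[W_n^2] = \frac{1}{\psi_n} \Biggl(\psi_0 W_0^2 + \sum_{j=1}^{n} \bigl(\beta_j\, \mathbb{E}[W_{j-1}] + a_m^2\bigr)\, \psi_j\Biggr), \tag{13}
\]

with

\[
\psi_n = \begin{cases}
\dfrac{\Gamma\bigl(n + \frac{T_0}{\sigma}\bigr)\,\Gamma\bigl(n + \frac{T_0 - 1}{\sigma}\bigr)}{\Gamma(n + \lambda_1)\,\Gamma(n + \lambda_2)}, & \text{model } \mathcal{M}; \\[2ex]
\dfrac{\Gamma\bigl(n + \frac{T_0}{\sigma}\bigr)^2}{\Gamma(n + \mu_1)\,\Gamma(n + \mu_2)}, & \text{model } \mathcal{R}.
\end{cases} \tag{14}
\]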

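We refine our previous result and observe that the expected value $\mathbb{E}[W_n]$ satisfies the asymptotic expansion

$$\mathbb{E}[W_n] = \frac{a_m}{1-\Lambda}\, n + \Bigl(W_0 - \frac{a_m T_0}{\sigma(1-\Lambda)}\Bigr)\frac{\Gamma(\frac{T_0}{\sigma})}{\Gamma(\frac{T_0}{\sigma}+\Lambda)}\, n^{\Lambda} + \frac{a_m T_0}{\sigma(1-\Lambda)} + O(n^{-1+\Lambda}). \tag{15}$$

Moreover, $g_n$ satisfies the asymptotic expansion

$$g_n = \frac{\Gamma(\frac{T_0}{\sigma}+\Lambda)}{\Gamma(\frac{T_0}{\sigma})}\, n^{-\Lambda}\Bigl(1 + \frac{\Lambda}{2n}\Bigl(1 - \Lambda - \frac{2T_0}{\sigma}\Bigr) + O\Bigl(\frac{1}{n^2}\Bigr)\Bigr). \tag{16}$$

Furthermore, $\beta_n$ satisfies for both urn models the asymptotic expansion

$$\beta_n = 2a_m + \frac{\Lambda(a_{m-1} + a_m)}{n} + O\Bigl(\frac{1}{n^2}\Bigr). \tag{17}$$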
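The expansion (16) can be checked numerically. The sketch below assumes that $g_n$ is the product of $T_j/(T_j + \sigma\Lambda)$ over $j = 0, \dots, n-1$ with $T_j = T_0 + j\sigma$; the names `tau` (standing for $T_0/\sigma$) and `lam` (standing for $\Lambda$), and all three helper functions, are introduced here for illustration only.

```python
import math

def g_product(n, tau, lam):
    """g_n as the product over j = 0..n-1 of (j + tau) / (j + tau + lam)."""
    g = 1.0
    for j in range(n):
        g *= (j + tau) / (j + tau + lam)
    return g

def g_gamma(n, tau, lam):
    """The same quantity written as a ratio of Gamma functions (via log-gamma)."""
    return math.exp(math.lgamma(n + tau) + math.lgamma(tau + lam)
                    - math.lgamma(tau) - math.lgamma(n + tau + lam))

def g_two_term(n, tau, lam):
    """Two-term approximation in the spirit of (16)."""
    q = math.exp(math.lgamma(tau + lam) - math.lgamma(tau))
    return q * n ** (-lam) * (1 + lam * (1 - lam - 2 * tau) / (2 * n))

if __name__ == "__main__":
    n, tau, lam = 1000, 2.0, 0.3  # illustrative values, not taken from the paper
    print(g_product(n, tau, lam), g_gamma(n, tau, lam), g_two_term(n, tau, lam))
```

The product and the Gamma-ratio form should agree up to rounding, while the two-term approximation should differ from them only at relative order $n^{-2}$.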


We need the expansion

$$\psi_n = n^{2\Lambda}\Bigl(1 + \frac{M}{n} + O\Bigl(\frac{1}{n^2}\Bigr)\Bigr),$$

with the constant M given by 

i(A?-A 1 + Al-A 2 -5 + ?- S S 11! + ? A 1 ), model M ; 

model 7 Z, 


M = 


\ 1 — AT + lA ~ 9-2 ~ 2^1 + 2 A 1 ), 


with $\lambda_i$, $\mu_i$ as given in Proposition 3. After simplifications it turns out that for both models the constant $M$ is given by

$$M = \Lambda^2 - \Lambda + \frac{\Lambda^2}{m} + \frac{2\Lambda T_0}{\sigma}.$$

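In order to keep track of the different expansions in a readable and transparent way, we introduce a shorthand notation:

$$\mathbb{E}[W_n] = E_1 n + E_2 n^{\Lambda} + E_3 + O(n^{-1+\Lambda}),$$

$$\beta_n = B_1 + B_2 n^{-1} + O(n^{-2}),$$

with constants $E_i$, $B_i$ as given in (15) and (17). We note that

$$(\mathbb{E}[W_n])^2 = E_1^2 n^2 + 2E_1E_2 n^{1+\Lambda} + 2E_1E_3 n + E_2^2 n^{2\Lambda} + o(n).$$

We shall prove that

$$\mathbb{E}[W_n^2] = E_1^2 n^2 + 2E_1E_2 n^{1+\Lambda} + \varphi_n, \tag{18}$$

with

$$\varphi_n = \begin{cases} \Bigl(\dfrac{a_m b_0 \Lambda^2}{m(1-2\Lambda)(1-\Lambda)^2} + 2E_1E_3\Bigr) n + o(n), & \text{for } \Lambda < \frac{1}{2}; \\[2mm] \dfrac{a_m b_0}{m}\, n \log n + O(n), & \text{for } \Lambda = \frac{1}{2}; \\[2mm] (C + E_2^2)\, n^{2\Lambda} + O(n), & \text{for } \Lambda > \frac{1}{2}. \end{cases}$$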
We start with the small-index urns satisfying $\Lambda < \frac{1}{2}$. Assume first that $0 < \Lambda < \frac{1}{2}$. We postpone the remaining case $\Lambda < 0$ to the end (note that $\Lambda = 0$ leads to a degenerate urn model). The expansion of $\mathbb{E}[W_n^2]$ is obtained as follows. First, let $j = j(n)$, with $j \to \infty$, and write

$$\frac{\beta_j\,\mathbb{E}[W_{j-1}] + a_m^2}{\psi_j} = B_1E_1\, j^{1-2\Lambda} + B_1E_2\, j^{-\Lambda} + (a_m^2 + B_1E_3 + E_1B_2 - B_1E_1 - B_1E_1M)\, j^{-2\Lambda} + O(j^{-1-\Lambda}). \tag{19}$$

Replacing the summands by their asymptotic expansion leads to an error of magnitude $O(1)$. This is fully sufficient for our purpose. Consequently, we use the following identity, which can be obtained using the Euler-Maclaurin summation formula (see [9], pages 595-596):

$$\sum_{j=1}^{n} j^{a} = \frac{n^{a+1}}{a+1} + \frac{n^{a}}{2} + \zeta(-a) + O(n^{a-1}), \tag{20}$$

for $a \neq -1$, where $\zeta(z)$ denotes the Riemann zeta function. We obtain the expansion

$$\sum_{j=1}^{n} \frac{\beta_j\,\mathbb{E}[W_{j-1}] + a_m^2}{\psi_j} = B_1E_1\Bigl(\frac{n^{2-2\Lambda}}{2-2\Lambda} + \frac{n^{1-2\Lambda}}{2}\Bigr) + B_1E_2\,\frac{n^{1-\Lambda}}{1-\Lambda} + (a_m^2 + B_1E_3 + E_1B_2 - B_1E_1 - B_1E_1M)\,\frac{n^{1-2\Lambda}}{1-2\Lambda} + O(1). \tag{21}$$
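As a quick plausibility check of the summation identity (20), the following sketch compares a partial power sum with the three leading terms for the exponent $a = -\frac{1}{2}$. The constant `ZETA_HALF` is the standard numerical value of $\zeta(\frac{1}{2})$; the function names are illustrative and not taken from the paper.

```python
import math

ZETA_HALF = -1.4603545088095868  # numerical value of zeta(1/2)

def power_sum(n, a):
    """Exact (up to rounding) value of sum_{j=1}^{n} j**a."""
    return math.fsum(j ** a for j in range(1, n + 1))

def euler_maclaurin(n, a, zeta_minus_a):
    """Leading terms of (20): n^(a+1)/(a+1) + n^a/2 + zeta(-a)."""
    return n ** (a + 1) / (a + 1) + n ** a / 2 + zeta_minus_a

if __name__ == "__main__":
    n, a = 10_000, -0.5
    print(power_sum(n, a) - euler_maclaurin(n, a, ZETA_HALF))
```

The printed difference should be of order $n^{a-1}$, in line with the error term in (20).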

Since $\frac{B_1E_1}{2-2\Lambda} = E_1^2$ and $\frac{B_1E_2}{1-\Lambda} = 2E_1E_2$, we obtain, taking into account the expansion of $\psi_n$, the following:

$$\psi_n \sum_{j=1}^{n} \frac{\beta_j\,\mathbb{E}[W_{j-1}] + a_m^2}{\psi_j} = E_1^2 n^2 + 2E_1E_2 n^{1+\Lambda} + \Bigl(\frac{B_1E_1}{2} + ME_1^2 + \frac{a_m^2 + B_1E_3 + E_1B_2 - B_1E_1 - B_1E_1M}{1-2\Lambda}\Bigr) n + o(n). \tag{22}$$

Consequently, the first two terms in $\mathbb{E}[W_n^2] - (\mathbb{E}[W_n])^2$ cancel out. Only a leading linear term remains in the variance, and its coefficient is

$$\frac{B_1E_1}{2} + ME_1^2 + \frac{a_m^2 + B_1E_3 + E_1B_2 - B_1E_1 - B_1E_1M}{1-2\Lambda} - 2E_1E_3.$$

The stated result follows after simplification aided by a computer algebra system and using the fact that $b_0 = \sigma(1-\Lambda) - a_m$.
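The simplification mentioned above can be spot-checked with exact rational arithmetic. This is a minimal sketch, assuming the constants read off from (15), (17) and the expansion of $\psi_n$ are $E_1 = a_m/(1-\Lambda)$, $E_3 = a_mT_0/(\sigma(1-\Lambda))$, $B_1 = 2a_m$, $B_2 = \Lambda(a_{m-1}+a_m)$ and $M = \Lambda^2 - \Lambda + \Lambda^2/m + 2\Lambda T_0/\sigma$; the function names are hypothetical.

```python
from fractions import Fraction

def linear_coefficient(a_m, sigma, lam, m, T0):
    """Coefficient of n in the variance, before simplification."""
    u = 1 - lam
    E1 = a_m / u
    E3 = a_m * T0 / (sigma * u)
    B1 = 2 * a_m
    a_m1 = a_m + sigma * lam / m          # a_{m-1}, from Lambda = m(a_{m-1} - a_m)/sigma
    B2 = lam * (a_m1 + a_m)
    M = lam ** 2 - lam + lam ** 2 / m + 2 * lam * T0 / sigma
    K = a_m ** 2 + B1 * E3 + E1 * B2 - B1 * E1 - B1 * E1 * M
    return B1 * E1 / 2 + M * E1 ** 2 + K / (1 - 2 * lam) - 2 * E1 * E3

def stated_coefficient(a_m, sigma, lam, m):
    """Coefficient of n as stated in Theorem 1 for small-index urns."""
    b0 = sigma * (1 - lam) - a_m
    return a_m * b0 * lam ** 2 / (m * (1 - 2 * lam) * (1 - lam) ** 2)

if __name__ == "__main__":
    args = (Fraction(1), Fraction(10), Fraction(1, 5), 2)
    print(linear_coefficient(*args, Fraction(2)), stated_coefficient(*args))
```

Both functions return the same rational number for the sample parameters, and the dependence on $T_0$ drops out, as it should for a leading-order variance constant.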

For $\Lambda < 0$ we can proceed in a similar way. The expansion (19) is still valid. The only difference is that the magnitude of the error is larger and of order $O(n^{-\Lambda})$ in (21). Nevertheless, the resulting expansion (22) is still valid due to the multiplication with $\psi_n \sim n^{2\Lambda}$.

For $\Lambda = \frac{1}{2}$ we proceed in a similar way. We use the identity

$$\sum_{j=1}^{n} \frac{1}{j} = \ln n + \gamma + O\Bigl(\frac{1}{n}\Bigr), \tag{23}$$

where $\gamma$ denotes the Euler-Mascheroni constant. We have $B_1E_1 = \frac{a_m^2}{(1-\Lambda)^2} = E_1^2$, and also $2B_1E_2 = 2E_1E_2$, such that

$$\psi_n \sum_{j=1}^{n} \frac{\beta_j\,\mathbb{E}[W_{j-1}] + a_m^2}{\psi_j} = E_1^2 n^2 + 2E_1E_2 n^{1+\Lambda} + (a_m^2 + B_1E_3 + E_1B_2 - B_1E_1 - B_1E_1M)\, n \ln n + O(n).$$

Consequently, the first two terms in $\mathbb{E}[W_n^2] - (\mathbb{E}[W_n])^2$ cancel out again, and the important constant is given by

$$a_m^2 + B_1E_3 + E_1B_2 - B_1E_1 - B_1E_1M.$$

The stated result is obtained after simplification.

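For large-index urns, $\Lambda > \frac{1}{2}$, we cannot neglect errors of magnitude $O(1)$ as in the case $0 < \Lambda < \frac{1}{2}$. In order to deal with the cancellations, we adapt (13) and use a different exact representation

$$\mathbb{E}[W_n^2] = \frac{W_0^2\,\psi_n}{\psi_0} + \psi_n \sum_{j=1}^{n}\Bigl(\frac{\beta_j\,\mathbb{E}[W_{j-1}] + a_m^2}{\psi_j} - B_1(E_1 j^{1-2\Lambda} + E_2 j^{-\Lambda})\Bigr) + \psi_n B_1 \sum_{j=1}^{n}(E_1 j^{1-2\Lambda} + E_2 j^{-\Lambda}).$$

Owing to (19) we know that the first sum is convergent by the comparison test. Application of (20) to the second sum gives

$$\psi_n B_1 \sum_{j=1}^{n}(E_1 j^{1-2\Lambda} + E_2 j^{-\Lambda}) = E_1^2 n^2 + 2E_1E_2 n^{1+\Lambda} + B_1(E_1\zeta(2\Lambda-1) + E_2\zeta(\Lambda))\, n^{2\Lambda} + o(n^{2\Lambda}).$$

The first two terms in $\mathbb{E}[W_n^2] - (\mathbb{E}[W_n])^2$ cancel out, and the constant $C$ is given by

$$C = \frac{W_0^2}{\psi_0} + \sum_{j=1}^{\infty} \frac{\beta_j\,\mathbb{E}[W_{j-1}] + a_m^2 - \psi_j B_1 j^{-\Lambda}(E_1 j^{1-\Lambda} + E_2)}{\psi_j} + B_1(E_1\zeta(2\Lambda-1) + E_2\zeta(\Lambda)) - E_2^2,$$

which proves the stated result.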


□ 
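The small-index case of Theorem 1 can also be illustrated by exact moment recursions rather than asymptotics. The sketch below is written under assumptions that are not spelled out in this section: sampling with replacement (so the number of white balls in a sample is binomial), and the affine rule that a sample with $k$ white balls adds $a_m + k\sigma\Lambda/m$ white balls. The function name and the parameter names `white0`, `black0` are illustrative.

```python
def exact_mean_variance(a, sigma, white0, black0, n):
    """Mean and variance of W_n for an affine scheme, sampling with replacement.

    a = [a_0, ..., a_m]; a[m - k] white balls are added when the sample
    contains k white balls (assumed convention for this sketch).
    """
    m = len(a) - 1
    h = a[m - 1] - a[m]                       # slope of the affine rule
    assert all(abs(a[m - k] - (a[m] + k * h)) < 1e-12 for k in range(m + 1))
    mean, var = float(white0), 0.0
    total = float(white0 + black0)
    for _ in range(n):
        p1 = mean / total                              # E[W/T]
        p2 = (var + mean * mean) / (total * total)     # E[(W/T)^2]
        growth = 1 + m * h / total                     # 1 + sigma*Lambda/T
        # law of total variance: Var(E[W_n | past]) + E[Var(W_n | past)]
        var = growth * growth * var + h * h * m * (p1 - p2)
        mean = growth * mean + a[m]
        total += sigma
    return mean, var

if __name__ == "__main__":
    a, sigma, n = [3, 2, 1], 20, 100_000     # Lambda = 2*(2-1)/20 = 0.1
    mean, var = exact_mean_variance(a, sigma, 1, 1, n)
    lam, a_m, b_0 = 0.1, 1, 17
    predicted = a_m * b_0 * lam ** 2 / (2 * (1 - 2 * lam) * (1 - lam) ** 2)
    print(mean / n, var / n, predicted)
```

For these illustrative parameters `var / n` should be close to the constant $a_mb_0\Lambda^2/(m(1-2\Lambda)(1-\Lambda)^2)$, with a discrepancy that shrinks roughly like $n^{2\Lambda-1}$.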


4.2. Almost-sure convergence of nontriangular urns. For triangular urns with $a_m = 0$ (or $b_0 = 0$ or both) we have already obtained a limit theorem for $W_n$ via the martingale in Proposition 4. A first byproduct of the previous result concerning the first and second moment is a limit theorem for $W_n/T_n$ for $a_m \neq 0$ (and $b_0 \neq 0$).


Proposition 5. Let $W_n$ be the number of white balls in the urn after $n$ draws. For nontriangular balanced affine urn models with $\Lambda < 1$ the ratio of white balls $W_n$ over the total number $T_n = T_0 + n\sigma$ after $n$ draws converges almost surely:

$$\frac{W_n}{T_n} \xrightarrow{\text{a.s.}} \frac{a_m}{\sigma(1-\Lambda)} = \frac{a_m}{a_m + b_0}.$$

Proof. Following [5] we use supermartingale theory to obtain the stated result. We only present the computation for model $\mathcal{R}$; the proof for model $\mathcal{M}$ is very similar. The following computations are somewhat lengthy, and preferably carried out with the help of a computer algebra system. Let $Z_n = \frac{W_n}{T_n} - \frac{a_m}{\sigma(1-\Lambda)}$. Using Proposition 1 we obtain

$$\mathbb{E}[Z_n \,|\, \mathbb{F}_{n-1}] = \Bigl(1 + \frac{\sigma\Lambda}{T_{n-1}}\Bigr)\frac{T_{n-1}}{T_n}\, Z_{n-1} = \frac{T_{n-1} + \sigma\Lambda}{T_n}\, Z_{n-1}.$$

Furthermore, in a manner similar to the proof of Proposition 3, we get

$$\mathbb{E}[Z_n^2 \,|\, \mathbb{F}_{n-1}] = \frac{T_{n-1}^2}{T_n^2}\Bigl(1 + \frac{2\sigma\Lambda}{T_{n-1}} + \frac{\Lambda^2(m-1)\sigma^2}{mT_{n-1}^2}\Bigr) Z_{n-1}^2 + \frac{\Lambda^2\sigma(\sigma(1-\Lambda) - 2a_m)}{(1-\Lambda)mT_n^2}\, Z_{n-1} + \frac{a_m\Lambda^2(\sigma(1-\Lambda) - a_m)}{(1-\Lambda)^2 mT_n^2}.$$

Hence,

$$\mathbb{E}[Z_n^2 \,|\, \mathbb{F}_{n-1}] \le \frac{(T_{n-1} + \sigma\Lambda)^2}{T_n^2}\, Z_{n-1}^2 + \frac{\Lambda^2\sigma(\sigma(1-\Lambda) - 2a_m)}{(1-\Lambda)mT_n^2}\, Z_{n-1} + \frac{a_m\Lambda^2(\sigma(1-\Lambda) - a_m)}{(1-\Lambda)^2 mT_n^2}.$$

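Now we use the fact that $\sigma(1-\Lambda) - a_m = b_0 > 0$ and also $a_m > 0$. Moreover, we know that $0 \le \frac{W_n}{T_n} \le 1$, and consequently $-\frac{a_m}{\sigma(1-\Lambda)} \le Z_n \le 1 - \frac{a_m}{\sigma(1-\Lambda)}$. Thus, there exists a constant $\kappa_1 = \kappa_1(m, a_{m-1}, a_m, \sigma)$, independent of $n$, such that

$$\mathbb{E}[Z_n^2 \,|\, \mathbb{F}_{n-1}] \le \frac{(T_{n-1} + \sigma\Lambda)^2}{T_n^2}\, Z_{n-1}^2 + \frac{\Lambda^2\sigma\,|(b_0 - a_m)Z_{n-1}|}{(1-\Lambda)mT_n^2} + \frac{a_m b_0\Lambda^2}{(1-\Lambda)^2 mT_n^2} \le \frac{(T_{n-1} + \sigma\Lambda)^2}{T_n^2}\, Z_{n-1}^2 + \frac{\kappa_1}{T_n^2}.$$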

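There exists a constant $\kappa_2 > 0$ such that $\frac{\kappa_1}{T_n^2} + \frac{\kappa_2}{T_n} \le \frac{\kappa_2}{T_{n-1}}$. Let $c_n = \frac{(T_{n-1} + \sigma\Lambda)^2}{T_n^2}$ with $0 < c_n < 1$ and $d_n = \frac{\kappa_1}{T_n^2} > 0$. We have

$$\mathbb{E}\Bigl[Z_n^2 + \frac{\kappa_2}{T_n} \,\Big|\, \mathbb{F}_{n-1}\Bigr] \le c_n Z_{n-1}^2 + d_n + \frac{\kappa_2}{T_n} \le Z_{n-1}^2 + \frac{\kappa_1}{T_n^2} + \frac{\kappa_2}{T_n} \le Z_{n-1}^2 + \frac{\kappa_2}{T_{n-1}}.$$

Hence, $Z_n^2 + \frac{\kappa_2}{T_n}$ is a positive supermartingale, which converges almost surely. Thus, $Z_n^2$ converges almost surely. Let $\lim_{n\to\infty} Z_n^2 = Z$ almost surely. Following [5], we next prove that $\mathbb{E}[Z_n^2] \to 0$. By dominated convergence this is sufficient to obtain the stated result since it implies that $\mathbb{E}[Z] = 0$ and so $Z = 0$ almost surely, such that $Z_n$ converges to 0 almost surely. We have

$$\mathbb{E}[Z_n^2] \le c_n\,\mathbb{E}[Z_{n-1}^2] + d_n.$$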

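By the comparison theorem

$$\sum_{n=1}^{\infty} d_n = \sum_{n=1}^{\infty} \frac{\kappa_1}{(n\sigma + T_0)^2} < \infty.$$

Moreover,

$$\prod_{k=1}^{n} c_k = \prod_{k=1}^{n} \frac{(T_{k-1} + \sigma\Lambda)^2}{T_k^2} = \prod_{k=1}^{n} \frac{(k + \Lambda - 1 + \frac{T_0}{\sigma})^2}{(k + \frac{T_0}{\sigma})^2} = \frac{\Gamma(n + \Lambda + \frac{T_0}{\sigma})^2\,\Gamma(1 + \frac{T_0}{\sigma})^2}{\Gamma(\Lambda + \frac{T_0}{\sigma})^2\,\Gamma(n + 1 + \frac{T_0}{\sigma})^2}.$$

Thus, we can use the following lemma (also used in [5]) to show that $\mathbb{E}[Z_n^2] \to 0$ and to finish our proof.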


Lemma 2 ([4]). Suppose $\{x_n\}_{n\ge1}$, $\{c_n\}_{n\ge1}$ and $\{d_n\}_{n\ge1}$ are nonnegative real sequences satisfying $x_{n+1} \le c_n x_n + d_n$, where $0 < c_n < 1$ for $n \ge 1$. If $\prod_{i=1}^{n} c_i \to 0$ and $\sum_{n=1}^{\infty} d_n < \infty$, then $x_n \to 0$.

By Stirling's formula the product $\prod_{k=1}^{n} \frac{(T_{k-1} + \sigma\Lambda)^2}{T_k^2}$ satisfies the asymptotic expansion

$$\prod_{k=1}^{n} \frac{(T_{k-1} + \sigma\Lambda)^2}{T_k^2} \sim \kappa_3\, n^{2(-1+\Lambda)},$$

for some constant $\kappa_3$. Since $\Lambda < 1$ the product tends to zero, and so does $\mathbb{E}[Z_n^2]$. □
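The mechanism of Lemma 2 can be seen numerically on the sequences used in this proof. The sketch below iterates the recursion with equality, taking $c_k = (T_{k-1}+\sigma\Lambda)^2/T_k^2$ and $d_k = \kappa/T_k^2$ with $T_k = \sigma(k + \tau)$; the function name, the name `tau` for $T_0/\sigma$, and the numerical values are assumptions made only for this illustration.

```python
def iterate_bound(lam, tau, kappa, sigma, x0, n):
    """Iterate x_k = c_k * x_{k-1} + d_k for k = 1..n and track prod c_k, sum d_k."""
    x, prod_c, sum_d = x0, 1.0, 0.0
    for k in range(1, n + 1):
        c = ((k - 1 + tau + lam) / (k + tau)) ** 2   # (T_{k-1} + sigma*lam)^2 / T_k^2
        d = kappa / (sigma * (k + tau)) ** 2          # kappa / T_k^2
        x = c * x + d
        prod_c *= c
        sum_d += d
    return x, prod_c, sum_d

if __name__ == "__main__":
    x, prod_c, sum_d = iterate_bound(lam=0.4, tau=1.0, kappa=1.0, sigma=1.0,
                                     x0=1.0, n=100_000)
    print(x, prod_c, sum_d)
```

With these values the product of the $c_k$ decays like $n^{2(\Lambda-1)}$, the sum of the $d_k$ stays bounded, and the iterate $x_n$ becomes small, which is the behaviour the lemma asserts.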

4.3. Almost-sure convergence for urns with large index. 

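Theorem 2. For nontriangular balanced affine urn models with a large index $\frac{1}{2} < \Lambda < 1$ the random variable $\mathcal{W}_n = g_n(W_n - \mathbb{E}[W_n])$ converges almost surely and in $\mathcal{L}_2$ to a limit $\mathcal{W}_\infty$.

Proof. By Proposition 4, $\mathcal{W}_n$ is a martingale. Hence, by martingale theory it suffices to prove that

$$\sum_{n=1}^{\infty} \mathbb{E}[(\nabla\mathcal{W}_n)^2] < \infty$$

in order to prove almost-sure and $\mathcal{L}_2$ convergence (see Chapter 10 of [25]). We use a standard argument:

$$\mathbb{E}[(\nabla\mathcal{W}_n)^2 \,|\, \mathbb{F}_{n-1}] = \mathbb{E}[(\mathcal{W}_n - \mathcal{W}_{n-1})^2 \,|\, \mathbb{F}_{n-1}] = \mathbb{E}[\mathcal{W}_n^2 - 2\mathcal{W}_n\mathcal{W}_{n-1} + \mathcal{W}_{n-1}^2 \,|\, \mathbb{F}_{n-1}].$$

By the martingale property we get further

$$\mathbb{E}[(\nabla\mathcal{W}_n)^2 \,|\, \mathbb{F}_{n-1}] = \mathbb{E}[\mathcal{W}_n^2 \,|\, \mathbb{F}_{n-1}] - 2\mathcal{W}_{n-1}\mathbb{E}[\mathcal{W}_n \,|\, \mathbb{F}_{n-1}] + \mathcal{W}_{n-1}^2 = \mathbb{E}[\mathcal{W}_n^2 \,|\, \mathbb{F}_{n-1}] - \mathcal{W}_{n-1}^2.$$

This implies that

$$\mathbb{E}[(\nabla\mathcal{W}_n)^2] = \mathbb{E}[\mathcal{W}_n^2] - \mathbb{E}[\mathcal{W}_{n-1}^2], \qquad n \ge 1.$$

Using the fact that $\mathcal{W}_0 = 0$ we obtain

$$\sum_{n=1}^{N} \mathbb{E}[(\nabla\mathcal{W}_n)^2] = \mathbb{E}[\mathcal{W}_N^2] = g_N^2\,\mathbb{V}[W_N].$$

By the asymptotic expansion (16) and the expansion of $\mathbb{V}[W_n]$ we observe that

$$\lim_{N\to\infty} \sum_{n=1}^{N} \mathbb{E}[(\nabla\mathcal{W}_n)^2] = C\,\frac{\Gamma^2(\frac{T_0}{\sigma}+\Lambda)}{\Gamma^2(\frac{T_0}{\sigma})} < \infty,$$

with $C$ as given in Theorem 1. □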

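Some corollaries of the relatively small variance for $\Lambda \le \frac{1}{2}$ are helpful in deriving further distributional results.

Corollary 1. Let $W_n$ be the number of white balls in the urn after $n$ draws. Then, for $\Lambda \le \frac{1}{2}$ we have²

$$W_n = \frac{a_m}{1-\Lambda}\, n + O_{\mathcal{L}_1}\bigl(\sqrt{\mathbb{V}[W_n]}\bigr) = \frac{a_m}{1-\Lambda}\, n + \begin{cases} O_{\mathcal{L}_1}(\sqrt{n}), & \Lambda < \frac{1}{2}; \\ O_{\mathcal{L}_1}(\sqrt{n \ln n}), & \Lambda = \frac{1}{2}, \end{cases}$$

and

$$W_n^2 = \frac{a_m^2}{(1-\Lambda)^2}\, n^2 + O_{\mathcal{L}_1}\bigl(n\sqrt{\mathbb{V}[W_n]}\bigr) = \frac{a_m^2}{(1-\Lambda)^2}\, n^2 + \begin{cases} O_{\mathcal{L}_1}(n^{\frac{3}{2}}), & \Lambda < \frac{1}{2}; \\ O_{\mathcal{L}_1}(n^{\frac{3}{2}}\sqrt{\ln n}), & \Lambda = \frac{1}{2}. \end{cases}$$

² By saying a sequence of random variables $Y_n$ is $O_{\mathcal{L}_1}(g(n))$, we mean there exist a positive constant $C$ and a positive integer $n_0$, such that $\mathbb{E}[|Y_n|] \le C|g(n)|$, for all $n \ge n_0$.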

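Proof. From the asymptotics of the mean and variance, as given in Proposition 2, (15) and Theorem 1, for large $n$ we have

$$\mathbb{E}\Bigl[\Bigl(W_n - \frac{a_m}{1-\Lambda}\, n\Bigr)^2\Bigr] = \mathbb{E}\Bigl[\Bigl((W_n - \mathbb{E}[W_n]) + \Bigl(\mathbb{E}[W_n] - \frac{a_m}{1-\Lambda}\, n\Bigr)\Bigr)^2\Bigr] = \mathbb{V}[W_n] + \Bigl(\mathbb{E}[W_n] - \frac{a_m}{1-\Lambda}\, n\Bigr)^2 = O(\mathbb{V}[W_n]). \tag{24}$$

So, by Jensen's inequality

$$\mathbb{E}\Bigl[\Bigl|W_n - \frac{a_m}{1-\Lambda}\, n\Bigr|\Bigr] \le \sqrt{\mathbb{E}\Bigl[\Bigl(W_n - \frac{a_m}{1-\Lambda}\, n\Bigr)^2\Bigr]} = O\bigl(\sqrt{\mathbb{V}[W_n]}\bigr),$$

and this implies

$$W_n = \frac{a_m}{1-\Lambda}\, n + O_{\mathcal{L}_1}\bigl(\sqrt{\mathbb{V}[W_n]}\bigr).$$

The second part follows by squaring. □

Corollary 2. Let $W_n$ be the number of white balls in the urn after $n$ draws. Then, we have

$$\frac{W_n}{n} \xrightarrow{\mathcal{L}_1} \frac{a_m}{1-\Lambda},$$

and

$$\frac{W_n^2}{n^2} \xrightarrow{\mathcal{L}_1} \Bigl(\frac{a_m}{1-\Lambda}\Bigr)^2.$$

So, both convergences occur in probability, too.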


4.4. Martingale central limit theorem. We follow the approach used in [17, 20] for special urns. We obtain a Gaussian law for $\mathcal{W}_n$ if a set of conditions for the martingale central limit theorem is satisfied. There is more than one such set (see [10]). One set of conditions convenient in our work combines the conditional Lindeberg condition and the conditional variance condition. The conditional Lindeberg condition requires that, for some positive increasing sequence $\xi_n$ and for all $\varepsilon > 0$,

$$U_n := \sum_{j=1}^{n} \mathbb{E}\Bigl[\Bigl(\frac{\nabla\mathcal{W}_j}{\xi_n}\Bigr)^2\, \mathbb{I}\Bigl\{\Bigl|\frac{\nabla\mathcal{W}_j}{\xi_n}\Bigr| > \varepsilon\Bigr\} \,\Big|\, \mathbb{F}_{j-1}\Bigr] \xrightarrow{P} 0,$$

and the conditional variance condition requires that, for some square integrable random variable $Y \neq 0$, we have

$$V_n := \sum_{j=1}^{n} \mathbb{E}\Bigl[\Bigl(\frac{\nabla\mathcal{W}_j}{\xi_n}\Bigr)^2 \,\Big|\, \mathbb{F}_{j-1}\Bigr] \xrightarrow{P} Y.$$

When these conditions are satisfied, we get

$$\frac{\mathcal{W}_n}{\xi_n} \xrightarrow{D} \mathcal{N}(0, Y),$$

where the right-hand side is a mixture of normals, with $Y$ being the mixer. It will turn out that the correct scale factors are

$$\xi_n = \begin{cases} n^{\frac{1}{2}-\Lambda}, & \text{for } \Lambda < \frac{1}{2}; \\ \sqrt{\ln n}, & \text{for } \Lambda = \frac{1}{2}. \end{cases} \tag{25}$$

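Lemma 3. The terms $\nabla\mathcal{W}_j$ satisfy $|\nabla\mathcal{W}_j| \le K j^{-\Lambda}$ for some positive constant $K$ and $1 \le j \le n$.


Proof. Suppose $\omega_j = W_j - W_{j-1}$ is the random number of white balls added right after the $j$th drawing. Starting from the definition of $\mathcal{W}_j$, we write the absolute difference as

$$|\nabla\mathcal{W}_j| = |\mathcal{W}_j - \mathcal{W}_{j-1}| = \Bigl|g_jW_j - W_0 - a_m\sum_{k=1}^{j} g_k - \Bigl(g_{j-1}W_{j-1} - W_0 - a_m\sum_{k=1}^{j-1} g_k\Bigr)\Bigr|$$

$$= |g_j(W_{j-1} + \omega_j) - g_{j-1}W_{j-1} - a_m g_j| = |W_{j-1}(g_j - g_{j-1}) + g_j\omega_j - a_m g_j| \le T_{j-1}|g_j - g_{j-1}| + 2q g_j,$$

with $q = \max_{0\le k\le m} |a_k|$. By definition of $g_n$ and the asymptotic expansion (16), $g_j = O(j^{-\Lambda})$ and further $|g_j - g_{j-1}| = O(j^{-1-\Lambda})$. Consequently, there exists a constant $K > 0$ such that

$$|\nabla\mathcal{W}_j| \le K j^{-\Lambda}.$$

□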


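Lemma 4.

$$U_n = \sum_{j=1}^{n} \mathbb{E}\Bigl[\Bigl(\frac{\nabla\mathcal{W}_j}{\xi_n}\Bigr)^2\, \mathbb{I}\Bigl\{\Bigl|\frac{\nabla\mathcal{W}_j}{\xi_n}\Bigr| > \varepsilon\Bigr\} \,\Big|\, \mathbb{F}_{j-1}\Bigr] \xrightarrow{P} 0.$$

Proof. Choose any $\varepsilon > 0$. Concerning $\Lambda < \frac{1}{2}$ we distinguish between $\Lambda < 0$ and $0 < \Lambda < \frac{1}{2}$. Lemma 3 asserts that for arbitrary $j$ with $1 \le j \le n$

$$\Bigl|\frac{\nabla\mathcal{W}_j}{\xi_n}\Bigr| \le \begin{cases} \dfrac{K}{n^{\frac{1}{2}}}, & \text{for } \Lambda < 0; \\[2mm] \dfrac{K}{j^{\Lambda} n^{\frac{1}{2}-\Lambda}}, & \text{for } 0 < \Lambda < \frac{1}{2}; \\[2mm] \dfrac{K}{j^{\frac{1}{2}}\sqrt{\ln n}}, & \text{for } \Lambda = \frac{1}{2}. \end{cases}$$

Hence, the sets $\{|\nabla\mathcal{W}_j/\xi_n| > \varepsilon\}$ are all empty, regardless of $\Lambda < 0$, $0 < \Lambda < \frac{1}{2}$ or $\Lambda = \frac{1}{2}$, for all $n$ greater than some positive integer $n_0(\varepsilon)$. For large $n$ (namely, $n > n_0(\varepsilon)$), we can stop the sum at $n_0(\varepsilon)$. By Lemma 3, we get

$$U_n = \sum_{j=1}^{n_0(\varepsilon)} \mathbb{E}\Bigl[\Bigl(\frac{\nabla\mathcal{W}_j}{\xi_n}\Bigr)^2\, \mathbb{I}\Bigl\{\Bigl|\frac{\nabla\mathcal{W}_j}{\xi_n}\Bigr| > \varepsilon\Bigr\} \,\Big|\, \mathbb{F}_{j-1}\Bigr] \le \frac{1}{\xi_n^2}\sum_{j=1}^{n_0(\varepsilon)} \mathbb{E}\Bigl[\frac{K^2}{j^{2\Lambda}} \,\Big|\, \mathbb{F}_{j-1}\Bigr] = \frac{K^2}{\xi_n^2}\sum_{j=1}^{n_0(\varepsilon)} j^{-2\Lambda} \to 0,$$

for $n \to \infty$. □

Lemma 5.

$$V_n = \sum_{j=1}^{n} \mathbb{E}\Bigl[\Bigl(\frac{\nabla\mathcal{W}_j}{\xi_n}\Bigr)^2 \,\Big|\, \mathbb{F}_{j-1}\Bigr] \xrightarrow{P} \frac{a_m Q^2(\sigma(1-\Lambda) - a_m)\Lambda^2}{m(1-\Lambda)^2(1-2\Lambda)}.$$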


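Proof. Let $Q := \frac{\Gamma(\frac{T_0}{\sigma}+\Lambda)}{\Gamma(\frac{T_0}{\sigma})}$. By (16) we have

$$g_n = Q n^{-\Lambda} + O(n^{-\Lambda-1}).$$

From this asymptotic relation, we also have

$$\nabla g_n = -Q\Lambda n^{-\Lambda-1} + O(n^{-\Lambda-2}).$$

As in the proof of Lemma 3, we write

$$\nabla\mathcal{W}_j = (\nabla g_j)W_{j-1} + g_j\omega_j - a_m g_j.$$

And so, we can write

$$(\nabla\mathcal{W}_j)^2 = \frac{Q^2}{j^{2\Lambda}}\Bigl(\Lambda^2\frac{W_{j-1}^2}{j^2} - 2\Lambda\frac{W_{j-1}}{j}\,\omega_j + 2\Lambda a_m\frac{W_{j-1}}{j} + \omega_j^2 - 2a_m\omega_j + a_m^2\Bigr) + O\Bigl(\frac{1}{j^{2\Lambda+1}}\Bigr).$$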


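Using the $\mathcal{L}_1$ approximation in Corollary 2, we write the conditional expectation

$$\mathbb{E}[(\nabla\mathcal{W}_j)^2 \,|\, \mathbb{F}_{j-1}] = \frac{Q^2}{j^{2\Lambda}}\Bigl(\Lambda^2\Bigl(\frac{a_m}{1-\Lambda}\Bigr)^2 - 2\Bigl(\frac{\Lambda a_m}{1-\Lambda} + a_m\Bigr)\mathbb{E}[\omega_j \,|\, \mathbb{F}_{j-1}] + \frac{2\Lambda a_m^2}{1-\Lambda} + \mathbb{E}[\omega_j^2 \,|\, \mathbb{F}_{j-1}] + a_m^2\Bigr) + O_{\mathcal{L}_1}\Bigl(\frac{1}{j^{2\Lambda+\frac{1}{2}}}\Bigr).$$

We already know exactly the conditional expectations of $\omega_j$ and $\omega_j^2$ from (12) and (11), respectively:

$$\mathbb{E}[\omega_j \,|\, \mathbb{F}_{j-1}] = \frac{\sigma\Lambda}{T_{j-1}}\, W_{j-1} + a_m$$

and

$$\mathbb{E}[\omega_j^2 \,|\, \mathbb{F}_{j-1}] = \mathbb{E}[W_j^2 \,|\, \mathbb{F}_{j-1}] - 2W_{j-1}\mathbb{E}[W_j \,|\, \mathbb{F}_{j-1}] + W_{j-1}^2 = \Bigl(\alpha_j - 2\Bigl(1 + \frac{\sigma\Lambda}{T_{j-1}}\Bigr) + 1\Bigr)W_{j-1}^2 + (\beta_j - 2a_m)W_{j-1} + \gamma_j,$$

with model-dependent sequences $\alpha_j$, $\beta_j$, $\gamma_j$ as given in (11) and in Proposition 3. Consequently, using asymptotic expansions of $\alpha_j$, $\beta_j$, $\gamma_j$, and Corollary 2 we obtain the following model-independent expansions

$$\mathbb{E}[\omega_j \,|\, \mathbb{F}_{j-1}] = \frac{a_m}{1-\Lambda} + O_{\mathcal{L}_1}\Bigl(\frac{1}{\sqrt{j}}\Bigr),$$

and

$$\mathbb{E}[\omega_j^2 \,|\, \mathbb{F}_{j-1}] = \frac{a_m(\Lambda^2(\sigma - \Lambda\sigma - a_m) + m a_m)}{m(1-\Lambda)^2} + O_{\mathcal{L}_1}\Bigl(\frac{1}{\sqrt{j}}\Bigr).$$

Putting all the elements together and simplifying, we obtain

$$\mathbb{E}[(\nabla\mathcal{W}_j)^2 \,|\, \mathbb{F}_{j-1}] = \frac{a_m Q^2(\sigma(1-\Lambda) - a_m)\Lambda^2}{m\, j^{2\Lambda}(1-\Lambda)^2} + O_{\mathcal{L}_1}\Bigl(\frac{1}{j^{2\Lambda+\frac{1}{2}}}\Bigr).$$

Now we can sum using (20) and (23). We get for $\Lambda < \frac{1}{2}$

$$V_n \xrightarrow{\mathcal{L}_1} \frac{a_m Q^2(\sigma(1-\Lambda) - a_m)\Lambda^2}{m(1-\Lambda)^2(1-2\Lambda)},$$

and, for $\Lambda = \frac{1}{2}$, we get

$$V_n \xrightarrow{\mathcal{L}_1} \frac{a_m Q^2(\frac{\sigma}{2} - a_m)}{m}.$$

This implies the required convergence in probability.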


□ 


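Having checked the two martingale conditions, we obtain a Gaussian law for the nondegenerate cases:

$$\frac{\mathcal{W}_n}{n^{\frac{1}{2}-\Lambda}} \xrightarrow{D} \mathcal{N}\Bigl(0, \frac{a_m(\sigma(1-\Lambda) - a_m)Q^2\Lambda^2}{m(1-\Lambda)^2(1-2\Lambda)}\Bigr)$$

for $\Lambda < \frac{1}{2}$, and

$$\frac{\mathcal{W}_n}{\sqrt{\log n}} \xrightarrow{D} \mathcal{N}\Bigl(0, \frac{a_m(\frac{\sigma}{2} - a_m)Q^2}{m}\Bigr)$$

for $\Lambda = \frac{1}{2}$. Translating this into a statement on the number of white balls and using $b_0 = \sigma(1-\Lambda) - a_m$ we get a main result of this investigation.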


Theorem 3. Suppose we have a two-color tenable affine balanced urn that grows by sampling sets of size $m$ with or without replacement, has index $\Lambda \le \frac{1}{2}$, and does not fall in any of the aforementioned degenerate cases.³ Let $W_n$ be the number of white balls after $n$ draws. For $\Lambda \le \frac{1}{2}$ we have Gaussian laws:

$$\frac{W_n - \frac{a_m}{1-\Lambda}\, n}{\sqrt{n}} \xrightarrow{D} \mathcal{N}\Bigl(0, \frac{a_m b_0 \Lambda^2}{m(1-\Lambda)^2(1-2\Lambda)}\Bigr), \qquad \text{for } \Lambda < \frac{1}{2},$$

and

$$\frac{W_n - 2a_m n}{\sqrt{n \log n}} \xrightarrow{D} \mathcal{N}\Bigl(0, \frac{a_m b_0}{m}\Bigr), \qquad \text{for } \Lambda = \frac{1}{2}.$$

³ Recall that several degenerate cases are excluded from this study, namely, the triangular case where $a_m = 0$ or $b_0 = 0$, the zero-balance cases, and the case $T_0 + m(a_{m-1} - a_m) < 0$.
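To get a feel for the small-index Gaussian law, one can simulate the urn directly. The following is only a sketch under stated assumptions: sampling with replacement, the convention that a sample with $k$ white balls adds `a[m - k]` white balls, and an illustrative affine scheme `a = [3, 2, 1]` with balance 20 (so $\Lambda = 0.1$). All function names and parameter values are hypothetical and not taken from the paper.

```python
import random
import statistics

def simulate_white(a, sigma, white0, black0, n, rng):
    """Run n draws of size m with replacement and return the final white count."""
    m = len(a) - 1
    white, total = white0, white0 + black0
    for _ in range(n):
        # k = number of white balls among m independent draws with replacement
        k = sum(rng.random() < white / total for _ in range(m))
        white += a[m - k]
        total += sigma
    return white

def standardized_samples(a, sigma, white0, black0, n, reps, seed=0):
    """Samples of (W_n - a_m n / (1 - Lambda)) / sqrt(n)."""
    rng = random.Random(seed)
    m = len(a) - 1
    lam = m * (a[m - 1] - a[m]) / sigma
    centre = a[m] * n / (1 - lam)
    return [(simulate_white(a, sigma, white0, black0, n, rng) - centre) / n ** 0.5
            for _ in range(reps)]

def summary(xs):
    """Sample mean and sample variance of the standardized values."""
    return statistics.mean(xs), statistics.variance(xs)

if __name__ == "__main__":
    xs = standardized_samples([3, 2, 1], 20, 1, 1, n=1000, reps=200, seed=1)
    print(summary(xs))
```

For this scheme the limiting variance in Theorem 3 is $a_mb_0\Lambda^2/(m(1-\Lambda)^2(1-2\Lambda)) = 0.17/1.296 \approx 0.131$, so the sample variance should land in that vicinity, up to Monte Carlo noise and finite-$n$ bias.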
5. Conclusion and Outlook

5.1. Summary. For two-color affine linear urn models with multiple drawings (sample size $m \ge 1$) under two sampling models, we studied the distribution of the number of white balls $W_n$ after $n$ draws. In the following we summarize our findings according to the index $\Lambda$ and state the order of growth of the expectation and variance.



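                    $\Lambda < \frac{1}{2}$                                                        $\Lambda = \frac{1}{2}$                                                        $\Lambda > \frac{1}{2}$
$\mathbb{E}[W_n]$   $n$                                                                            $n$                                                                            $n$
$\mathbb{V}[W_n]$   $n$                                                                            $n \log n$                                                                     $n^{2\Lambda}$
Limit law           $\frac{W_n - \mathbb{E}[W_n]}{\sqrt{\mathbb{V}[W_n]}} \to \mathcal{N}(0,1)$    $\frac{W_n - \mathbb{E}[W_n]}{\sqrt{\mathbb{V}[W_n]}} \to \mathcal{N}(0,1)$    $\mathcal{W}_n \xrightarrow{\text{a.s.}} \mathcal{W}_\infty$

Table 1. Overview of the results for nontriangular urns, $a_m b_0 \neq 0$.


Here $\mathcal{W}_n = g_n(W_n - \mathbb{E}[W_n])$ with $g_n = \prod_{j=0}^{n-1} \frac{T_j}{T_j + m(a_{m-1} - a_m)} \sim C n^{-\Lambda}$. Note that for the non-normal limit law of large-index urns the distribution will depend on the sampling model; this will be discussed in a companion work, as well as the moment structure.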

5.2. Quadratic expected value and beyond. Using Lemma 1 it is possible to extend Proposition 1 to characterize all ball replacement matrices leading to a conditional expected value of quadratic type, cubic type, etc., and in general to a polynomial of degree $k$, with $1 \le k \le m$. Beginning with the extension to quadratic types,

$$\mathbb{E}[W_n \,|\, \mathbb{F}_{n-1}] = \alpha_{n,2}W_{n-1}^2 + \alpha_{n,1}W_{n-1} + \alpha_{n,0}, \qquad n \ge 1,$$

we obtain the same condition for the $a_k$'s for both models but different resulting coefficients $\alpha_{n,2}$, $\alpha_{n,1}$ and $\alpha_{n,0}$. It is possible to obtain an explicit expression for the expected value of $W_n$, but the arising formula is very complicated and does not seem to lead easily to precise asymptotic expansions.

5.3. Urns with a large index, triangular urns, and more colors. In the companion work [18], we complete the study of balanced urn models with multiple drawings and affine conditional expectation. In particular, we provide a detailed analysis of the moments of $W_n$ for triangular urn models and also for urns with a large index $\Lambda > 1/2$. The analysis is based on the so-called method of moments applied to the centered moments of $W_n$ and the martingales $\mathfrak{W}_n$, $\mathcal{W}_n$.

In order to generalize the affinity condition of Proposition 1 to more than two colors it is beneficial to rewrite the $a_k$'s as an affine combination of $a_0$ and $a_m$: $a_k = \frac{m-k}{m}a_0 + \frac{k}{m}a_m$, $0 \le k \le m$. This idea can be readily generalized to $r \ge 2$ colors. One obtains a martingale of the form

$$\mathbb{E}[X_n \,|\, \mathbb{F}_{n-1}] = \Bigl(I + \frac{1}{T_{n-1}}A^T\Bigr)X_{n-1},$$

with $X_n = (X_n^{[1]}, \dots, X_n^{[r]})^T$, where $X_n^{[\ell]}$ denotes the random number of balls colored $\ell$ and $I$ the identity matrix. The matrix $A^T$ is a certain $r \times r$ matrix (somewhat similar to the “reduced” ball replacement matrix $A$ introduced before), being composed of $r$ vectors appearing in the general affinity condition. Compared to the two-color case, simple expressions for the (mixed) moments of $X_n$ do not seem to exist, but it may be possible to study the limiting distribution of $X_n$ using different methods.
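The linear conditional-expectation structure can be illustrated for $r = 2$ colors. The sketch below assumes, purely for illustration, that the matrix has rows $(a_0, b_0)$ and $(a_m, b_m)$ and enters with the factor $1/T_{n-1}$; the paper's general construction of the matrix is not reproduced here, and the function name is hypothetical.

```python
def expected_counts(A, x0, n):
    """Iterate E[X_n] = (I + A^T / T_{n-1}) E[X_{n-1}] for a balanced scheme."""
    r = len(x0)
    x = [float(v) for v in x0]
    sigma = sum(A[0])              # balance: every row of A has the same sum
    total = float(sum(x0))
    for _ in range(n):
        # (A^T x)_i = sum_j A[j][i] * x[j]
        x = [x[i] + sum(A[j][i] * x[j] for j in range(r)) / total for i in range(r)]
        total += sigma
    return x

if __name__ == "__main__":
    # two colors, affine scheme a = (3, 2, 1) with balance 20: rows (a_0, b_0), (a_m, b_m)
    A = [[3, 17], [1, 19]]
    print(expected_counts(A, [1, 1], 50))
```

In this two-color instance the first coordinate follows the scalar recursion $\mathbb{E}[W_n] = (1 + \sigma\Lambda/T_{n-1})\mathbb{E}[W_{n-1}] + a_m$ with $\sigma\Lambda = m(a_{m-1} - a_m) = 2$, and the coordinates always sum to $T_n$.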